“What, you’re worried about a bunch of lay-people overseeing your surgery?” says Dr Kathleen Gray, senior research fellow in health informatics in Melbourne University’s faculty of medicine, dentistry and health sciences.
Gray is commenting on the startling results of some studies conducted in the US during the past year. The studies looked at ways to better evaluate the abilities of surgeons – a process that is rarely carried out.
Disturbed by this lack of oversight, Dr Thomas Lendvay, an associate professor of paediatric urology at the University of Washington, set up novel experiments in which groups of several hundred randomly recruited people with no medical qualifications were asked to view videos of surgeons at work and rate their performance against a set of criteria. Groups of qualified surgeons were invited to do the same.
In each case, the amateur consensus and the expert view were pretty much the same. The main difference was that the crowd-sourced results were available on average within 24 hours, while the medicos took 24 days.
The fact that an aggregated amateur judgment in a field as specialised as surgery turns out to be on the money seems counter-intuitive to most people, but it comes as little surprise to Gray and many others working in the burgeoning field known as medical crowd-sourcing.
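The statistical intuition is easy to demonstrate. The Python sketch below is a toy illustration only, not the method of any of the studies mentioned, and every number in it is invented. Assuming each untrained rater gives an unbiased but noisy score, the crowd's average homes in on the true value, with the error shrinking roughly as one over the square root of the number of raters.

```python
import random

random.seed(42)

TRUE_SCORE = 7.2   # hypothetical "true" skill score for one surgeon
NOISE = 2.0        # spread of an individual rater's error

def rate():
    """One untrained rater: an unbiased but noisy estimate of the true score."""
    return random.gauss(TRUE_SCORE, NOISE)

for n in (1, 10, 100, 1000):
    crowd_mean = sum(rate() for _ in range(n)) / n
    print(f"{n:5d} raters -> average rating {crowd_mean:.2f}")
```

With one rater the estimate wanders by a couple of points; with a thousand it sits within a few hundredths of the true score.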
Indeed, Karl Mattingly, a Melbourne-based crowd-sourcing entrepreneur and former banker, saw a familiar pattern in the results. “We use something similar to do ‘judgmental risk assessment’ in credit risk and investment performance,” he says.
There are a couple of broad models to define the crowd-sourcing process. In the first, data provided by a vast pool of volunteers is gathered up and analysed. A good example of this approach is the National Breast Cancer Research Foundation’s LifePool project, which asks women to donate lifestyle information and mammogram data, thus building a huge data-set for researchers.
The second model – and the one that most interests Mattingly – involves utilising the opinions and decision-making skills of large numbers of volunteers. It is a concept that he dubs “the wisdom of crowds”, and in some cases it is challenging the entire concept of medical expertise.
Mattingly is the chief executive of a software start-up called slowVoice. He works closely with Professor Anne-Louise Ponsonby, who heads the environmental and genetic epidemiology research group at the Murdoch Children’s Research Institute in Parkville.
In July, the pair co-authored a commentary in the journal EBioMedicine, outlining the benefits and challenges of using crowd-sourcing in medical research.
They focused especially on a landmark study published this year, led by Professor Francisco Candido dos Reis, in which 98,000 lay-people were asked to look at 180,000 oestrogen receptor images and identify potential signs of cancer.
Their results – averaged and weighted in clever ways – were every bit as accurate as those of a group of pathologists asked to do the same.
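The study's own weighting scheme is not described here, but one common approach to aggregating crowd labels is a weighted vote, in which each volunteer's voice counts in proportion to how well they performed on images with known answers. The Python sketch below is a hypothetical illustration of that general idea; the volunteer names, weights and labels are all invented.

```python
from collections import defaultdict

# Hypothetical votes: (volunteer, image, label), where label 1 = "signs of cancer"
votes = [
    ("v1", "img_a", 1), ("v2", "img_a", 1), ("v3", "img_a", 0),
    ("v1", "img_b", 0), ("v2", "img_b", 1), ("v3", "img_b", 0),
]

# Per-volunteer weights, e.g. each person's accuracy on gold-standard images
weights = {"v1": 0.9, "v2": 0.6, "v3": 0.8}

def weighted_consensus(votes, weights, threshold=0.5):
    """Call an image positive if its weighted share of positive votes passes the threshold."""
    total, positive = defaultdict(float), defaultdict(float)
    for volunteer, image, label in votes:
        w = weights[volunteer]
        total[image] += w
        positive[image] += w * label
    return {image: positive[image] / total[image] >= threshold for image in total}

print(weighted_consensus(votes, weights))  # {'img_a': True, 'img_b': False}
```

The effect is that a careless or unlucky volunteer can still take part without dragging down the consensus.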
Impressive as the results were, Mattingly believes they don’t express the full potential of using massed non-expert input in medical research.
“Crowd-sourcing goes beyond the answers you can get, to the definition of the problem and the framing of the question,” he says. “Then you’ve got something that’s effectively a form of collective intelligence.”
This month, Mattingly’s company unveiled its flagship product, a crowd-sourcing platform called almanis. Although it is primarily a commercial venture, not-for-profit science and medical research organisations worldwide will be able to use the platform without charge.
And that’s good news for public health physicians such as Anne-Louise Ponsonby over at the Murdoch institute, who sees crowd-sourcing as a potential game-changer. The new definition of “expert”, she suggests, will be the person who can best control a crowd – although she prefers the terms “flock” or “swarm”.
Crowd-sourcing information and analyses, she says, introduces a cultural and intellectual diversity that is usually absent from even large specialist groups.
“With a big open group – say, a big teleconference – there is a danger that one ‘expert’, one extrovert, would infect everyone else and cause correlated errors.
“Crowd-sourcing is more like the idea of the hidden ballot. The madness of the mob arises when all participants are intercommunicating; collective intelligence arises when they are all acting independently.”
To Ponsonby, crowd-sourcing brings the twin benefits of muffling the extroverts while allowing the introverts to be heard.
“It’s important to have the over- and under-estimators in the name of cognitive diversity,” she says. “You don’t just want people who always hit the bullseye.”
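Ponsonby's distinction between independent and intercommunicating judges is also easy to simulate. The Python sketch below (a toy model; all parameters are invented) adds a single shared bias to every rater's estimate – think of a whole teleconference anchoring on one loud expert – and shows that the shared error survives averaging, while purely individual noise washes out.

```python
import random

random.seed(1)

TRUE_VALUE = 50.0
N_RATERS = 500
INDIVIDUAL_NOISE = 10.0   # each rater's private error
SHARED_BIAS = 8.0         # an error common to everyone, e.g. anchoring on one expert

def mean_crowd_error(shared, trials=200):
    """Average error of the crowd's mean estimate across repeated trials."""
    errors = []
    for _ in range(trials):
        bias = random.gauss(0, SHARED_BIAS) if shared else 0.0
        estimates = [TRUE_VALUE + bias + random.gauss(0, INDIVIDUAL_NOISE)
                     for _ in range(N_RATERS)]
        errors.append(abs(sum(estimates) / N_RATERS - TRUE_VALUE))
    return sum(errors) / trials

print(f"independent raters: typical error {mean_crowd_error(shared=False):.2f}")
print(f"correlated raters:  typical error {mean_crowd_error(shared=True):.2f}")
```

Five hundred independent raters land within about half a point of the truth; the same five hundred, once they share a bias, miss by several points no matter how many of them there are.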
With a good evidence base and a natural affinity with the current “sharing” zeitgeist, medical research crowd-sourcing is booming. Software companies have been quick to formalise the processes by which scientists link up either with large numbers of non-experts or, sometimes, with anonymous flocks of their own peers.
The leader in the field, arguably, is a company called Kaggle. Researchers are invited to post their data on the site, pose a question, and let a potentially unlimited number of statisticians, data miners and enthusiastic amateur sleuths have at it.
Another site, CrowdMed, reverses the amateurs-helping-pros paradigm. GPs or patients are invited to post case histories on the site, after which a crowd of medical specialists offers opinions. CrowdMed boasts of solving “60 per cent of medical cases that stumped individual doctors”.
Earlier this year, Sermo, a doctors-only social media network, launched its Australian arm. The company spruiks the fact that medicos are encouraged to ask anonymous questions about treatment options and receive anonymous, crowd-sourced replies from other members.
Computing giant Apple has also waded into the field, releasing a crowd-sourcing-friendly software framework called ResearchKit, specifically geared to medical work. The open-source framework allows scientists to construct apps that optimise data collection from potentially very large numbers of study participants.
It is being used to study several conditions, including autism, epilepsy and melanoma, by institutions in the US such as Johns Hopkins and Duke University.
In Australia it has been adopted by the Black Dog Institute, a world-leading research body concerned with mental health, based in Randwick in Sydney.
Black Dog has been a global pioneer in linking up with people suffering from a range of mental illnesses, facilitating the real-time self-reporting of symptoms, and delivering automated interventions.
The institute encourages people to participate in the process through a mobile phone-optimised website called MyCompass, or through apps developed in-house, such as Snapshot and Biteback.
The apps, however, are two-way streets, with consenting participants providing what accumulates into a large and valuable pool of crowd-sourced data for research purposes.
“We’ve been using ResearchKit as a bit of a platform for developing interventions, and for gathering consent before trials, and to deliver questions,” said Black Dog’s director and chief scientist, Professor Helen Christensen. “It’s a way of saving a lot of effort.”
She said crowd-sourcing data depended on two critical values: transparency and truth.
“Our purpose in collecting data and delivering interventions is to improve mental health,” she said. “We’re very specific about what data we need from people in order to do our research and have it published in the highest quality journals.”
The institute is currently seeking participants for a range of crowd-sourcing projects, including studies tracking social media related to mental illness, self-management strategies for people with suicidal thoughts, and the emotional experiences of people who have never suffered clinical depression.
Black Dog’s apps are just a few among a growing gallery of online mechanisms designed to measure and record physical and mental health. Fitness trackers, heart rate monitors, pedometers and so on now come as standard inclusions on new-model smartphones and smartwatches.
Many people use these devices to monitor their own wellbeing, to varying degrees of obsessiveness, but there is no shortage of researchers eager to encourage individuals to contribute their data to ever-growing deposits, there to be mined for research.
Crunching data, of course, is exactly what Google does, and the tech behemoth’s life sciences division is in the process of rolling out an ambitious information-collection exercise called Baseline.
According to reports, Baseline will involve a staggeringly large number of people using a variety of applications to upload a wide range of individual data, including not only heart rate but also genome-specific information.
“They want to build a Google-style model of what constitutes a healthy human being,” Gray says.
In a field that generally regards crowd-sourcing with what sometimes seems like the same naive wonder with which fourth-graders regard the problem-solving power of search engines, Gray is a necessary sceptic. Not a denier; just a sceptic.
“I think we’re not there yet, in terms of both the design and function of the tools, and also in the design of the research around them,” she says.
“There is promise – there have been some startling large-scale results – but the technology is quite immature.”
In a recent paper, she and colleague Bibin Punnoose looked at methods available to ordinary people for accumulating information measured by consumer gadgets. They identified several problems and concluded that “these factors limit the realisation of benefits for individual or population health.”
“Large amounts of moment-to-moment data potentially give new insights at a pattern level of population health,” she says. “But there are real limitations in the design of the devices. Data doesn’t flow as it should.” This can lead to errors, which compromise the data.
Problems arising from various gadgets not talking to each other are the sort of thing, presumably, that will be ironed out over time as manufacturers agree on connectivity protocols.
Gray, however, has identified another issue arising from large-scale crowd-sourced data flowing in from fitness trackers – one that runs up against the need for “cognitive diversity” if conclusions are to be valid.
Not surprisingly, fitness trackers are popular among serious athletes – hardly the most representative sector of most communities. There is, however, a second over-represented demographic.
“One of the limitations of using self-tracking as crowd-sourced data is that our indications are that it attracts a fairly geeky type of person,” she says.
“It attracts some people because they like to play with tech toys. It is perhaps not a cohort whose data is directly useful for learning about the people you most need to find out about.”
That acknowledged, however, there seems little doubt that crowd-sourcing is changing society’s notions of what constitutes an expert, what comprises a definitive diagnosis, and how medical competence is measured.
In the not too distant future, perhaps, patient consultations will shift from being one-to-one interactions to become guided involvements with a matrix of the qualified and the lay.
“Certainly among my colleagues who are health professionals there is recognition that technology will influence scopes of practice, and enable new ways of receiving care – diagnosis, education, and follow-up, for instance,” Gray says.
“Changes are coming.”