Thursday, 17 December 2015

Putting your money where your mouth is: could betting fix science?

You may be aware of the current “replication crisis” in science.
If you aren’t: over the last decade the human and biological sciences (psychology in particular) have seen a dramatic increase in the number of papers that are poorly executed, fail to replicate, or are never replicated at all.
It has been estimated that these irreproducible studies waste $28 billion a year in the US alone.
There are many reasons why these poor studies are conducted and published. The way job performance is measured is a major factor, as the value of an academic is judged by the number of papers they publish and the grant money they are able to bring in for their institute.
Many believe that this has led to a drop in the overall quality of manuscripts being published, as more and more researchers feel the pressure to publish positive results in order to progress in their careers or even keep their job at all. This means that negative results are rarely published and few people conduct replications of others’ studies for fear of not getting the results published.
For example, a now-landmark study set out to estimate the reproducibility of recently published psychological research. The research group took 100 psychology studies published in 2008 in three prestigious psychology journals and repeated them as closely to the originals as possible.
They found that, although 97% of the original studies showed a statistically significant effect, only 36% of the replication studies did so too, and just 39% of the replications were judged to have reproduced the original result. There is hope though, as the higher quality studies were far more likely to be replicated successfully than poorly designed studies. There are obvious generalisations made here, and the authors of the study are very aware of the limitations of conducting such studies, but it was the first in a number of calls to arms for improving the quality, repeatability and credibility of science.

Photo credit Logan Faerber

Now that the extent of the problem is clear, many academics have started to look for ways of fixing it. This has led to the launch of several journals solely dedicated to negative results, a new Center for Open Science (COS), and the Open Science Framework, which allows readers not only to read a publication freely but also to easily see the raw data and analyses used in the study.
The same team of researchers that founded the COS may have found an interesting way of predicting whether a study will be successfully replicated. The researchers turned to prediction markets, which work like a stock market except that the ‘stocks’ are studies that are going to be replicated within the next two months.
The studies used in this research were part of the reproducibility study mentioned above. To set the starting price of each study in the prediction market, the researchers asked experts to estimate, from 0 to 100%, the chance that each study would be replicated. The experts were also asked to rate their own knowledge of the area of study, so that their answers could be weighted accordingly.
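To make the weighting step concrete, here is a minimal sketch of how an expertise-weighted starting price could be calculated. It assumes a simple weighted mean, which is my own choice for illustration; the function name and the numbers are hypothetical and are not taken from the paper.

```python
def weighted_starting_price(estimates, knowledge):
    """Weighted mean of replication estimates (0-100).

    estimates: each expert's estimated chance of replication, in percent.
    knowledge: each expert's self-rated knowledge, used as the weight
               (assumption: a higher number means more knowledge).
    """
    if not estimates or len(estimates) != len(knowledge):
        raise ValueError("need one knowledge rating per estimate")
    total_weight = sum(knowledge)
    if total_weight <= 0:
        raise ValueError("knowledge ratings must sum to a positive number")
    weighted_sum = sum(e * k for e, k in zip(estimates, knowledge))
    return weighted_sum / total_weight


# Hypothetical survey: three experts, the second one knows the area best.
print(weighted_starting_price([40, 60, 80], [1, 2, 1]))  # prints 60.0
```

The more an expert says they know about the area, the more their estimate pulls the starting price towards it.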
When the market opened, the same experts bought shares in the studies they were more confident would replicate and sold shares in those they were less confident about. They were also able to revise their opinions in line with the rest of the group, for example if a study was reaching a higher price (and therefore a higher implied chance of replication) than they had first thought. The final share price of a study (up to a maximum of 100) indicated the chance of replication predicted by the collective group of experts.
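This post doesn't go into how the market turned trades into prices, so the sketch below is only an illustration. It assumes a logarithmic market scoring rule (LMSR), a common automated market maker for prediction markets. The class name, the liquidity value and the trade sizes are all hypothetical.

```python
import math


class ReplicationMarket:
    """Binary 'will this study replicate?' market, priced with LMSR (assumed)."""

    def __init__(self, starting_price=50.0, liquidity=100.0):
        if not 0.0 < starting_price < 100.0:
            raise ValueError("starting_price must be strictly between 0 and 100")
        self.b = liquidity  # higher liquidity means each trade moves the price less
        p = starting_price / 100.0
        # Seed the outstanding 'yes' shares so the opening price equals starting_price.
        self.q_yes = self.b * math.log(p / (1.0 - p))
        self.q_no = 0.0

    def _cost(self):
        # LMSR cost function: C(q) = b * ln(exp(q_yes / b) + exp(q_no / b))
        return self.b * math.log(
            math.exp(self.q_yes / self.b) + math.exp(self.q_no / self.b)
        )

    def price(self):
        """Current price of a 'yes' share on a 0-100 scale."""
        e_yes = math.exp(self.q_yes / self.b)
        e_no = math.exp(self.q_no / self.b)
        return 100.0 * e_yes / (e_yes + e_no)

    def buy(self, shares, outcome="yes"):
        """Buy shares in an outcome and return what the trade costs."""
        before = self._cost()
        if outcome == "yes":
            self.q_yes += shares
        elif outcome == "no":
            self.q_no += shares
        else:
            raise ValueError("outcome must be 'yes' or 'no'")
        return self._cost() - before


market = ReplicationMarket(starting_price=50.0)
market.buy(50, "yes")  # a confident expert backs the study
print(round(market.price(), 1))  # the price has moved above 50
market.buy(80, "no")  # a sceptical expert pushes it back down
print(round(market.price(), 1))
```

Whatever the real mechanism was, the idea is the same: every trade nudges the price, and the price at closing time is read as the group's estimate of the chance of replication.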
The final market prices correctly predicted whether a study would replicate 71% of the time, versus 58% for the initial survey of individual experts. Because the market allowed people to change their minds based on what others were thinking, the collective knowledge of the group increased the accuracy of the estimates by 13 percentage points over the expertise-weighted survey.
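For anyone who wants to check the arithmetic, here is a small sketch of how a set of final prices can be scored and how the gap between 58% and 71% can be expressed. It assumes a price above 50 counts as a prediction that the study will replicate; that threshold, the function names and the example data are my own assumptions, not the paper's results.

```python
def market_accuracy(final_prices, replicated, threshold=50.0):
    """Percentage of studies where the market called the outcome correctly.

    A final price above `threshold` is treated as predicting replication
    (assumed scoring rule).
    """
    if not final_prices or len(final_prices) != len(replicated):
        raise ValueError("need one outcome per final price")
    hits = sum(
        (price > threshold) == bool(outcome)
        for price, outcome in zip(final_prices, replicated)
    )
    return 100.0 * hits / len(final_prices)


def percentage_point_gain(baseline, improved):
    """Absolute difference between two percentages."""
    return improved - baseline


def relative_gain(baseline, improved):
    """Improvement expressed as a percentage of the baseline."""
    return 100.0 * (improved - baseline) / baseline


# Hypothetical market of four studies: final prices and whether each replicated.
print(market_accuracy([80, 30, 65, 45], [True, False, False, False]))

# The figures quoted in this post: survey 58%, market 71%.
print(percentage_point_gain(58, 71))
print(round(relative_gain(58, 71), 1))
```

The second pair of numbers shows why wording matters: going from 58% to 71% is a gain of 13 percentage points, but relative to the survey's own accuracy the improvement is larger than 13%.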
This may not sound like much, but if you look at figure 1 in the study you can clearly see that the prediction market was much better at predicting the success or failure of a study than individual predictions. This is particularly true for studies that the experts were individually unsure about (those they rated at around a 50% chance of successful replication), where the market was very successful at subsequently coming to a much stronger decision.
Although the improvement is rather modest, further refinements of the market (e.g. increasing the number of participants, keeping the market open for longer, or including more studies) may help to increase the accuracy further.
I would be interested in seeing how experts treated new studies after seeing a table of previous markets. You can read the whole paper and the supplementary information for free, and you can see all the files and analyses for free on the Open Science Framework.
The prediction market formed much stronger opinions about which studies would replicate, and with a higher rate of accuracy, than individual experts did, even those with a high level of knowledge in the area. This could allow decision makers to make more informed choices about which studies deserve support from funding bodies and publishers, creating a greater drive towards high-quality, replicable science. Once larger markets are used, and repeated studies verify this as a valid method for predicting repeatability, journals could use it to help judge whether an article is worth publishing. Alternatively, funding bodies could use such markets to determine which studies are most likely to be reproducible and award funding accordingly.

