Play it again Sam

On roasting lamb and reproducibility in science – Part 1

Have you ever had the experience of turning out the perfect roast dinner, with tender lamb, crispy potatoes with fluffy insides, and perfect golden brown pumpkin, all beautifully timed to arrive at the table at once, delighting yourself and your guests?

… And have you had the experience of never being able to quite nail it the same way again? Every subsequent roast doesn’t quite reach the same pitch of delicious amazingness. Same oven, same temperature, same process…but tough meat comes out one time, a bit too rare on another. Occasionally the potatoes come out from the oven as hard as they went in, and there are times when the pumpkin collapses into a soggy mess that simply refuses to go brown.

If you have experienced this, you have run into a problem with reproducibility. It is the same problem faced by the scientist who cannot precisely replicate their experiment – even when they seem to do it exactly the same way.

Reproducibility is regarded as the hallmark of great science. It means that an experiment yields the same results when it is repeated using the same method, whether in the same lab or a different one, and whether by the same scientists or different ones.

If the results are the same, or similar enough, when the study is done more than once, then it is reproducible. This is a big deal, regarded as the gold standard of high-quality science. It means that the results are believed to be reliable, and therefore that they make a solid foundation that can be built on with further experiments.

This might sound simple, but it is remarkably difficult to achieve. If you are unconvinced about how difficult it is for a scientist to do the same thing exactly the same way, just consider the “perfect meal” example given above. Even the most experienced cook or scientist, with a solid, consistent technique, will agree that there are unaccountable factors that will stymie even their best efforts to “play it again” perfectly.

There are many factors that affect our capacity to reproduce experiments. These come from the very uncontrollable world around the scientist. Scientific experiments are very subtle and refined processes. The slightest change in a single parameter can literally annihilate an experiment, sending the researcher back to square one, or falsely casting it as a major breakthrough, when all along it was just a big fat error.

An incredibly tiny quantity too much of this or that has the potential to throw a result out completely.
The freshness of a chemical, or the use of a different batch of the same chemical, can have profound effects on very subtle experiments.
Humidity can produce results that give the impression of a really significant finding…but that later turn out to be a glitch tied to particular weather conditions, and cannot be reproduced once the weather changes.
Scientists have found the oddest factors have impeded their ability to conduct reproducible science – an opened lab window when it should have been shut, the tiniest air leak stymying a bacteria breeding process so that an experiment cannot be verified, temperature affecting chemical behaviour, different groups of animals responding differently at different times of year, distilled water in different laboratories having different effects.
When people are the subjects of a study, they can do things that totally throw a study for a loop – taking a medication they shouldn’t, eating food they are not supposed to, or just failing to turn up for appointments.
People make mistakes. For example, scientists don’t always record their methods accurately, or they might use the wrong chemical. Senior scientists might miss errors in the way experiments are done, and in the way they are analysed, by junior staff.
Bias, conscious or unconscious, will result in scientists’ actions subtly (and sometimes not so subtly) “coaxing” results in the preferred direction. Give the experiment to different scientists, unflavoured by that particular bias, and the results will not be reproduced.
The same applies to vested interests, which have a very concerning impact on results, especially in an era where much research is funded by private organisations and replicated by the same organisation, if it is replicated at all.

Establishing the validity of an experiment by “playing it again” is fraught with a number of institutional problems, not least that it is so very expensive and time-consuming to do science properly. It is hard enough to get the money for a fresh new bit of research, let alone to secure more scarce funding to repeat something that somebody has already done. The result is that very few studies are actually duplicated.

There is a strong push right now within the scientific community to make reproducibility a reality for all published material and not just a nice aspiration or admirable theoretical concept. This push follows a series of embarrassments of epic proportions, in which prominent researchers played fast and loose with their experiments, results and conclusions – in some cases simply making them up.

The result has been embarrassment at personal and institutional levels. Worse still, these exposures have happened hot on the heels of excitement and celebration over major breakthroughs that have been found profoundly flawed at best, or a complete fabrication at worst. All of this plays out in the public arena in this day and age when science is becoming cool, sexy and newsworthy.

A major detrimental effect of these shenanigans is that people, including scientists, are losing trust in science.

The prestigious journal Science recently exposed the huge problem of reproducibility – or rather the stark lack of it – in modern science. They published a study in which a large group of collaborating researchers (Aarts, A., Anderson, J., Anderson, C., Attridge, P., and Attwood, A. et al, 2015)^[1] tried to reproduce 100 experiments that had originally been published in 2008 by three highly esteemed psychology journals: Psychological Science, Journal of Personality and Social Psychology, and Journal of Experimental Psychology: Learning, Memory, and Cognition. Following the methods as precisely as possible, and going so far as to contact the original authors for their materials, they found a substantial decline in the measured effects in the reproduced studies. Some studies fared better than others, but this result is alarming when we consider how little of the science we rely on has been confirmed in this way.

One man pushing for increased reliability and reproducibility in studies is the scientist John Ioannidis, now based at Stanford University in the US. Ioannidis stunned the scientific world when he published the 2005 paper “Why Most Published Research Findings Are False”^[2] – truly a shocking claim, made painfully real by its validation in the passing years.

Ioannidis has called on those in the field of science to at least acknowledge bias and vested interest, and to work consciously to eliminate them.

Perhaps these suggestions are easier said than done, because the systems in science are very deeply entrenched at an organisational and functional level. The wheels are turning far too slowly in executing the very necessary changes to address this problem. The resistance, both human and systemic, is strong.

It is a very serious problem when we consider how profoundly science impacts every aspect of our lives – from the technology we use, to the food we eat and the medicines we take. Science carries a very strong authority over us, and has a profound influence on our choices and ways of thinking. It is disturbing that this authority, and the strength of its claims, rest upon a reproducibility that is simply not happening, even to the most meagre degree.

We all need to be aware that we cannot hand our power away to science with a trust that is blind and deaf to this substantial problem. In part 2 we will look more deeply at reproducibility and ask what its value is in a world in which everything is evolving.

References:

[1]
Aarts, A., Anderson, J., Anderson, C., Attridge, P., and Attwood, A. et al (2015). Estimating the reproducibility of psychological science. Science. 349 (6251).
[2]
Ioannidis, J. (2005). Why Most Published Research Findings Are False. PLoS Medicine. DOI: 10.1371/journal.pmed.0020124, retrieved from http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124.
