The Triple Helix @ UChicago

Fall 2016

“What Does the Replication Crisis Mean for Psychology and Research?” by Julia Smith

 

With headlines like “Is Science Broken?”[1] and “Psychology’s Credibility Crisis,”[2] it is clear that people are questioning the fundamental reliability of the scientific process, and specifically that of psychology. Recently, the replication crisis – the realization that many published scientific findings cannot be replicated – shook up the world of research. Psychology’s replication crisis is currently in the spotlight, but it has implications for all scientific research. The replication crisis is a divisive subject because it calls into question the validity of previously accepted discoveries in psychology and, by extension, that of future studies. The discovery of the replication crisis certainly poses problems, but, having been reminded of research’s shortcomings, scientists are placing a renewed emphasis on skepticism and rigor. 

Replication is the process of re-creating an experiment to see if its findings can be reproduced. Reproducibility is a cornerstone of science: a finding is not considered valid if replicated experiments do not consistently yield the same results. There is a spectrum of replication types, with exact replication on one end and conceptual replication on the other. In exact replication, researchers follow the procedure of the original experiment to the letter. In conceptual replication, researchers test the previously established principle but diverge from the experimental methods of the first study.[3]

Journals want to publish original research, so replication studies were not considered worthwhile until it became clear that psychology was in the midst of a replication crisis. Researchers first uncovered the crisis by meticulously reproducing old experiments and treating published findings with a healthy dose of skepticism. That skepticism has been building for the past decade or so. In 2005, Dr. John Ioannidis published a well-known paper entitled “Why Most Published Research Findings Are False.”[4] The paper used a mathematical model that took into account factors such as a study’s sample size, effect size, flexibility in experimental design, and bias to support the title’s proposition. It raised awareness in the scientific community about the potential for error, pinpointed some contributing factors, and encouraged researchers to practice skepticism. Though there was some doubt regarding the reliability of Ioannidis’ model, it was, nevertheless, an important reminder that our research practices are not infallible; “it opened up the possibility that, to the extent that this model holds true, […] many false positives would be published,”[5] says Dr. Ronald Thisted, a University of Chicago researcher who studies reproducibility. 
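Ioannidis’ argument is easiest to see in terms of positive predictive value: the chance that a statistically significant finding reflects a real effect depends on how plausible the hypothesis was to begin with and on how well-powered the study is. The short Python sketch below restates that logic; the specific odds and power values are illustrative assumptions, not figures from the paper.

```python
# Illustrative sketch of the positive-predictive-value logic behind
# Ioannidis (2005). All numbers below are assumptions for illustration,
# not values taken from the paper.

def ppv(prior_odds, power, alpha=0.05):
    """Probability that a 'significant' finding reflects a true effect.

    prior_odds: ratio of true to false hypotheses tested in a field
    power:      chance that a real effect reaches significance (1 - beta)
    alpha:      chance that a null effect reaches significance anyway
    """
    true_positives = power * prior_odds
    false_positives = alpha  # per unit of false hypotheses tested
    return true_positives / (true_positives + false_positives)

# A field chasing long-shot hypotheses with modestly powered studies:
print(ppv(prior_odds=0.1, power=0.35))  # ~0.41: most "findings" are false
# More plausible hypotheses tested with well-powered studies:
print(ppv(prior_odds=0.5, power=0.8))   # ~0.89
```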

Another major step in this growing awareness – and the reason people know about the replication crisis – was the Reproducibility Project. In 2011, Dr. Brian Nosek, a University of Virginia researcher, started the Reproducibility Project, coordinating the reproduction of 100 recently published psychological studies. Dr. Matt Motyl, a University of Illinois at Chicago scientist who worked in Dr. Nosek’s lab and participated in the project, describes its inception: “In our lab we would talk about weird things in the literature and any time we would look at a paper that seemed weird or counterintuitive we were like ‘how does that happen.’” Often, they found that dubious statistics and sample sizes were to blame, but these observations were just idle talk until one day they came upon a study that supported Extrasensory Perception, the theory that some humans have a “sixth sense.” Skeptical of this proposition, Nosek’s lab and others ran replications of the experiment. They could not replicate the original results, but the same psychological journal that had published the original article would not publish their findings. Dr. Motyl recalls, “That increased our interest in trying to replicate things, and at that point we decided ‘Okay, let’s do this very scientifically. Let’s randomly sample 100 studies from some of the big journals in the field and then assign them to teams.’ Many of us did elaborate pilot tests and we talked to original authors and tried to get them to approve our materials. […] We thought that trying to replicate would be important for the field because science is supposed to be able to replicate. If it doesn’t replicate then something’s wrong there.”[6] In this way the Reproducibility Project began. Dr. Nosek and the 270 researchers who participated in the project were onto something: only 36% of the 100 findings could be replicated. Results such as these can make us wonder how these original findings were discovered and published. [7] 

There are a variety of possible reasons for contradictions between original findings and the findings from a replication study. Some of these explanations, such as p-hacking, implicate the scientists involved in the original study. P-hacking is the manipulation of data or analyses to push a p-value under .05, the conventional threshold for declaring a result statistically significant and thus publishable. Despite the obsession in research with sub-.05 p-values, clearing that threshold alone is not always a sufficient standard of proof.[8] Other reasons for non-replication reflect weaknesses in the methods of the original study: perhaps the sample size was too small, or a factor other than the independent variable strongly influenced the results. It is also possible that the replication studies themselves had methodological flaws. Experimental design and working with data are thorny tasks, and sometimes no one is to blame, but the fact remains that psychology now has a host of contradictions to explain. 
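One way to see why p-hacked results so often fail to replicate is to simulate the practice. The sketch below (in Python; the sample sizes and number of outcome measures are chosen purely for illustration, not drawn from any study cited here) measures five outcomes per simulated “study” when there is no real effect and counts a publication-worthy result whenever any outcome dips below .05; the false-positive rate climbs from the nominal 5% to over 20%.

```python
# Minimal simulation of one form of p-hacking: measuring several outcomes
# and reporting only the one with the smallest p-value. Parameters are
# illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_outcomes, n_per_group = 10_000, 5, 30

false_positives = 0
for _ in range(n_studies):
    # Both groups come from the same distribution: there is no real effect.
    control = rng.normal(size=(n_outcomes, n_per_group))
    treatment = rng.normal(size=(n_outcomes, n_per_group))
    pvals = [stats.ttest_ind(c, t).pvalue for c, t in zip(control, treatment)]
    if min(pvals) < 0.05:  # report whichever outcome "worked"
        false_positives += 1

# With a single pre-registered outcome this rate would hover near 5%;
# picking the best of five pushes it past 20%.
print(false_positives / n_studies)
```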

For many researchers, the replication crisis constitutes an important reminder to conduct rigorous research, but it does not mark the end of psychology – or science – as we know it. Psychology journals have taken steps to address the problems raised by the crisis. Though original findings are still favored, journals are now more likely to publish replications, and they allow more space for researchers to describe their methods and calculations. This push towards greater transparency facilitates replications and incentivizes rigor over flashy findings with little backing. These promising new practices seem to be working. Dr. Marc Berman, a cognitive psychologist at the University of Chicago, says the replication crisis reminded us that “Everybody needs to be more careful. Everybody needs to really rigorously evaluate their science and make sure that what they’re doing is replicable.” At the same time, Dr. Berman cautions, “You don’t want it to go too far the other way where scientists lose their creativity and their freedom. You’ve got to find that good balance.”[9] Hopefully, with a renewed sense of skepticism, the field can successfully re-equilibrate. 

References

[1] Woolston, Chris. 2015. “Online debate erupts to ask: is science broken?” Nature 519, 393. Accessed December 8, 2016. doi: 10.1038/519393f
[2] Horgan, John. 2016. “Psychology’s Credibility Crisis: the Good, the Bad, and the Ugly.” Scientific American. Accessed December 8, 2016. https://www.scientificamerican.com/article/psychology-s-credibility-crisis-the-bad-the-good-and-the-ugly/. 
[3] Noba. 2016. “The Replication Crisis in Psychology.” Accessed November 26, 2016. http://nobaproject.com/modules/the-replication-crisis-in-psychology#content. 
[4] Ioannidis, John P. A. 2005. “Why Most Published Research Findings Are False.” PLOS Medicine. Accessed November 26, 2016. doi: 10.1371/journal.pmed.0020124. 
[5] Dr. Ronald Thisted in discussion with the author, November 2016. 
[6] Dr. Matt Motyl in discussion with the author, November 2016. 
[7] Open Science Framework. 2015. “Estimating the Reproducibility of Psychological Science.” Last modified March 3, 2016. https://osf.io/ezcuj/wiki/home/.  
[8] FiveThirtyEight Science. 2015. “Science Isn’t Broken.” Accessed November 26, 2016. http://fivethirtyeight.com/features/science-isnt-broken/#part1.  
[9] Dr. Marc Berman in discussion with the author, November 2016.

 