Is Your Research Credible? Here's How to Check.
Many studies suffer from bias or statistical issues that make their findings false. Here's why, along with domain-specific resources to help you understand the problem and guide credible work.
Although Dr. John Ioannidis published his landmark paper “Why Most Published Research Findings Are False” almost 20 years ago, the topic of “false research” hasn’t subsided much. If anything, it’s gotten more complicated, with faulty findings still being unearthed and triggering avalanches of retractions, like the recent “disproving” of a common miscarriage prevention. Similarly, the accumulation of under-powered studies in ecology is leading to inaccurate meta-analyses.
The persistence of this issue is of little surprise, since the prevailing current of research is still driven by the pressure to publish. Although we’ve covered the detrimental effects this has on various aspects of academia, in this issue we tackle the problem from a new angle. From scientists deliberately avoiding certain topics in order to get published, to overwhelming numbers of papers failing to meet quality and reliability standards, the integrity of the scientific method is buckling under the weight of publishing and other external pressures. Ultimately, this compromises the very pursuit of high-quality science.
However, since Dr. Ioannidis’s original paper, many organisations have cropped up with resources and guidelines for researchers to follow. In fact, unlike most issues today, growing awareness and good use of these resources may be all it takes to transform this widespread problem into the seed of a “secondary” scientific revolution. Now that we’re getting science done at scale, it’s time to do it right.
Statistical issues, bias, and exaggerations
When Dr. Ioannidis released his initial paper on the topic about two decades ago, it was clear that the issue of “false” research claims was already widespread. Although his paper generated a new wave of interest and brought to light the prevalence of the issue across all domains, some researchers had already been well aware of it. For example, a 1994 review covering three decades of psychotherapy research found continual misuse and misinterpretation of null hypothesis testing and p-values.
Today, awareness continues to grow across domains, with new papers scrutinising the practices specific to each field. For example, a recently published Nature paper reviewed 350 ecology papers published in the last few years and found widespread issues: insufficient statistical power, the biases that follow from it, and failure to report certain variables in analyses. The premise is simple: given that statistically significant results are preferred over non-significant ones, and that only exaggerated effect estimates reach statistical significance in under-powered studies, there is a clear and prevalent exaggeration bias across the discipline. The authors found that over half of the findings from under-powered studies were exaggerated at least 2-fold. Many of these issues were not malicious, but stemmed from a lack of understanding of what credible empirical ecology requires.
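To see how under-powered studies mechanically produce inflated estimates, here is a minimal simulation sketch of this "winner's curse". It is not taken from the review itself; the true effect size, sample size, and significance threshold are illustrative assumptions.

```python
# Minimal simulation of the mechanism described above: with a small true effect
# and small samples (low power), only exaggerated effect estimates cross the
# significance threshold. All numbers below are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

true_effect = 0.2       # small true effect (in standard-deviation units)
n_per_group = 25        # small samples -> low statistical power
alpha = 0.05
n_studies = 10_000

significant_estimates = []
for _ in range(n_studies):
    control = rng.normal(0.0, 1.0, n_per_group)
    treatment = rng.normal(true_effect, 1.0, n_per_group)
    t, p = stats.ttest_ind(treatment, control)
    if p < alpha and t > 0:  # only "positive and significant" results get reported
        significant_estimates.append(treatment.mean() - control.mean())

mean_significant = float(np.mean(significant_estimates))
print(f"True effect: {true_effect:.2f}")
print(f"Mean estimate among significant results: {mean_significant:.2f}")
print(f"Average exaggeration factor: {mean_significant / true_effect:.1f}x")
```

Running a sketch like this typically shows the "publishable" results overestimating the true effect two- to three-fold, which is the same mechanism the review quantifies across the ecology literature.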
This set of issues is thankfully the most “curable”, so to speak: it simply requires greater awareness and understanding of how statistical analyses should be applied and what claims they can support. The more challenging, but equally important, approaches are those involving systemic changes, like tackling the incentives that reward more subtle forms of bias (e.g. publication bias).
Biases beyond statistics
Although one of the main issues here is the actual methodology and statistical analysis of results, another prevalent and more subtle problem also permeates the academic landscape. This one deals less with how research is done and more with how it is reviewed and interpreted by the broader audience.
Back in 1966, Dr. David Lykken argued that statistical significance alone wasn’t enough for a good experiment, taking as his example a 1964 study published in an American Psychological Association journal. The original study “found” that patients who subconsciously believed in anal parturition were more likely to see frogs (a cloacal animal) in the Rorschach test. Not only was the theory far-fetched, but the statistical support for its claims was lacking. Although the debate can be framed as frequentist versus Bayesian (where the latter can incorporate an exceedingly low prior probability for the result), the real takeaway is much bigger. As Lykken put it, “Editors must be bold enough to take responsibility for deciding which studies are good and which are not, without resorting to letting the p-value of the significance tests determine this decision”.
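Lykken's point about priors can be made concrete with a back-of-the-envelope Bayesian calculation. The prior, power, and false-positive rate below are purely illustrative assumptions, not numbers from the original papers.

```python
# Back-of-the-envelope Bayes: a "significant" result barely moves the needle
# when the hypothesis was wildly implausible to begin with.
# All numbers are illustrative assumptions.
prior = 0.001   # prior probability that the far-fetched theory is true
power = 0.80    # P(significant result | theory true)
alpha = 0.05    # P(significant result | theory false), the false-positive rate

# Bayes' theorem: P(theory | significant result)
posterior = (power * prior) / (power * prior + alpha * (1 - prior))
print(f"Posterior probability the theory is true: {posterior:.3f}")  # roughly 0.016
```

Even after observing a significant result, the theory remains very unlikely, which is precisely the sort of judgement Lykken wanted editors to exercise rather than deferring to the p-value.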

But how have editors changed in the last 57 years?
Just last month, Dr. Patrick Brown sparked a minor academic scandal when he published his article: I Left Out the Full Truth to Get My Climate Change Paper Published. He felt his safest bet was to focus on the climate change narrative rather than pursue the other factors that also contribute to the increased risk of wildfires in California. Although he stands by his work, he believes that aligning strictly with the climate change narrative makes work more publishable and discourages other avenues of scientific questioning. In fact, he argues he is one of many researchers doing this, and may simply be unique in openly acknowledging it. The whole case isn’t too different from the not-so-distant pandemic, and the public shaming incurred by scientists who called for more evidence-based decision making.
These cases demonstrate how the system sometimes favours external factors (public opinion, political pressure, economic incentives, etc.) over unbiased, rational thinking, the very premise of “good science”. It’s not just external influences, though, but also the “gut feeling” people tend to fall back on when confronted with something that contradicts their preconceived notions. A more recent example is a theory of consciousness being labelled as pseudoscience, even though some scientists feel it can be meaningfully tested.

A more serious example, plucked from history, is the story of Ignaz Semmelweis, the 19th-century physician and “saviour of mothers”. He realised that hand washing dramatically improved mothers’ chances of surviving childbirth, but his colleagues didn’t buy it. His end was admittedly tragic: although he had the empirical data to back up his claim, he couldn’t explain the phenomenon, was mocked by the community, was committed to a mental asylum, and died from wounds sustained after being beaten by the guards.
So where does this leave us? It’s not just the widespread misuse of statistical analysis. There are also non-quantitative biases at stake, most of which are simply related to our human instincts and tendencies towards believing what is familiar. Ironically, this is the same issue scientists have when sharing findings with the general public. Although we can’t fault ourselves for our natural inclinations, we can better adhere to scientific principles and reduce bias as much as possible.
“Scientists make mistakes. Accordingly, it is the job of the scientist to recognize our weakness, to examine the widest range of opinions, to be ruthlessly self-critical. Science is a collective enterprise with the error-correction machinery often running smoothly.” - Carl Sagan in The Demon-Haunted World
Supporting credible research
There are many mainstream and emerging solutions, resources, and organisations tackling this issue from various angles. Broadly, these solutions fall into two categories: A) technical fixes, like how to conduct statistical analysis effectively across domains, and B) cultural and societal shifts, like giving “negative” studies more recognition.
Improving statistical understanding
As the recent review of ecology studies demonstrates, a significant portion of the “false” findings in the literature are not false because of a flawed underlying scientific process (after all, these are peer-reviewed publications); they simply lack a sufficiently rigorous application of statistical methods. The following resources and organisations share current best practice for interpreting results, often focusing on statistical analyses:
Statistics Done Wrong is a scientist’s guide to the most common slip-ups made by “scientists every day, in the lab and in peer-reviewed journals”.
Ten Simple Rules for Effective Statistical Practice, a 2016 “Ten Simple Rules” paper in PLOS with easy tips, such as “signal always comes with noise”.
Science Forum: Ten common statistical mistakes to watch out for when writing or reviewing a manuscript, a 2019 article that reads like a combination of the two resources above (one such pitfall is illustrated in the sketch after this list).
The ASA Statement on p-Values: Context, Process, and Purpose, a 2016 American Statistical Association statement clearly defining p-value, with additional commentary.
Seeing Theory, an interactive website (and accompanying PDF) that introduces probability and statistics.
GRADE is the leading standard in “grading quality (or certainty) of evidence and strength of recommendations” in healthcare.
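As a taste of the kind of pitfall these guides cover, here is a small sketch of one classic mistake: running many uncorrected comparisons, where "significant" findings appear even when no real effects exist. The number of tests and sample sizes are illustrative assumptions.

```python
# One classic pitfall: testing many outcomes without correcting for multiple
# comparisons. Even with no true effects, the chance of at least one
# "significant" result grows quickly. Parameters below are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_experiments = 5_000
n_tests = 20      # e.g. 20 measured outcomes, none truly different between groups
alpha = 0.05
n = 30            # per-group sample size

runs_with_false_positive = 0
for _ in range(n_experiments):
    p_values = [
        stats.ttest_ind(rng.normal(size=n), rng.normal(size=n)).pvalue
        for _ in range(n_tests)
    ]
    if min(p_values) < alpha:
        runs_with_false_positive += 1

print(f"Chance of at least one false positive: {runs_with_false_positive / n_experiments:.0%}")
# Analytically: 1 - (1 - alpha)**n_tests is roughly 0.64
```

A Bonferroni-style correction (dividing alpha by the number of tests) brings this back down to roughly the nominal 5%.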
Changing culture
Improving the credibility of research in a systematic fashion means addressing the underlying issues compromising academic integrity. At the core is the pressure to publish highly influential work, which drives researchers to prioritise the projects they believe will attract more citations. Publication bias from editors also contributes, favouring certain studies and suppressing others. Both are interlinked, and both feed into the reproducibility crisis, since negative results aren’t published as often and novel research is generally rewarded more heavily than replication.
One of the most compelling practices for alleviating publication bias is the use of Registered Reports (RRs), which shift the focus from outcomes to methodology, determined and approved at the outset. RRs are already offered by over 300 journals, including those by Nature, The Royal Society, and Cortex. Since RRs involve a waiting period, though, some (like these SORTEE ecologists) recommend using them only for studies where researchers can afford to wait and have a strong study rationale in place.
Organisations in this area focus on informing scientists about sources of bias and providing the community with resources on how to create transparent, reproducible, and highly credible research.
The Catalogue of Bias Collaboration - Researchers compiling a Catalogue of Bias for health studies, following the work of David Sackett. They currently offer a database of biases with information and preventative guidelines for each one.
Project TIER aims to improve transparency and reproducibility in the social sciences with course materials and workshops.
The Collaborative Replications and Education Project (CREP) focuses on educating students on how to do high-quality replications in psychology.
Society for Open, Reliable, and Transparent Ecology and Evolutionary Biology (SORTEE) aims to “improve reliability and transparency through cultural and institutional changes in ecology, evolutionary biology, and related fields” by hosting events, providing resources, and creating a community.
Transparency and Openness Promotion (TOP) Guidelines are a set of transparency standards journals can adhere to. They offer a corresponding TOP Factor, which measures how well a journal implements open science practices (although many journals are not yet listed).
Many more organisations exist that focus primarily on reproducibility. The list can be found in the resources section of our earlier article on the subject.
It’s worth noting that better incentivising peer review is also an important and likely highly effective solution. The main question is how. DeSci presents some novel perspectives here, which we’ve covered before. But it’s also worth taking a leaf out of the book of open-source projects: thousands of developers across the world voluntarily dedicate significant time to high-impact projects like Linux, Python, and TensorFlow. Perhaps a similar cultural shift can be sparked within the academic community to make peer review contributions equally enriching. The biggest obstacle may simply be time. Institutions would need to reduce other obligations so researchers have the time they need for peer review; currently, many researchers decline reviews on account of being too busy. Allotting adequate time and creating a positive culture may be one of the most underrated and achievable solutions.
“I think that science is like swimming in an ocean, it is swimming in an ocean at night, so it is dangerous but at the same time it is extremely enjoyable. For someone who has been swimming in Greece at night, this is probably one of the best experiences that you can have. The least thing that you can imagine is that you might drown one day or there are maybe sharks around ... but it is what it is ... and we have to learn being able to swim in that ocean.” - Dr. John Ioannidis, Out To See Film
See the Litmap of Dr. Ioannidis’s work.
Ending this article where we started, we turn back to that landmark paper by Dr. John Ioannidis. “Why Most Published Research Findings Are False” has been viewed over 3 million times and has over 7,000 citations. Along with other meta-researchers in the field, Dr. Ioannidis continues to question and build awareness around credible science, often at the cost of public opinion. His case isn’t unlike the other examples where researchers come under severe and even abusive fire for questioning contemporary practices (take Ignaz Semmelweis historically, or Holly Lawford-Smith today). Of course, such questioning is a core tenet of scientific inquiry and skepticism, and the meta-research revival reflects a general awareness and readiness to respond throughout the community, evident in the relevant publications, organisations, and initiatives. Researchers are taking this challenge in stride, seeking to uphold the pursuit of authentic and credible science.
Where do you stand on credible science? What factors do you think play a role, and what can researchers do to better address them? Feel free to share in the comments below, or get in touch. We’re always seeking experts and thoughtful researchers to contribute to The Scoop.
Resources
Maternal health points to need for oversight on scientific research, Sept 2023
Prominent Consciousness Theory Is Slammed as Bogus Science, Sept 2023
I Left Out the Full Truth to Get My Climate Change Paper Published, Sept 2023
Empirical evidence of widespread exaggeration bias and selective reporting in ecology, Aug 2023
Why Most Published Research Findings Are False, 2005
Statistics for researchers
Grading of Recommendations Assessment, Development and Evaluation (GRADE)
Ten Simple Rules for Effective Statistical Practice, 2016
The ASA Statement on p-Values: Context, Process, and Purpose, 2016
Organisations
The Catalogue of Bias Collaboration (health)
Project TIER (social sciences)
The Collaborative Replications and Education Project (CREP) (psychology)
Society for Open, Reliable, and Transparent Ecology and Evolutionary Biology (SORTEE) (ecology)
Transparency and Openness Promotion (TOP) Guidelines (for journals)
Framework for Open and Reproducible Research Training (FORRT)
Reproducibility for Everyone (R4E)
The Institute for Replication (I4R)
Association for Interdisciplinary Meta-research and Open Science (AIMOS)