AI-Generated Junk Science Is Flooding Medical Journals — And No One’s Stopping It

In a quiet university café in the south of England, postgraduate researcher Tulsi Suchak stared at her laptop, brow furrowed. “Another one,” she muttered, “and it looks exactly like the five we saw last week.” Her supervisor, Dr. Matt Spick, looked up knowingly — this wasn’t the first time she’d made that observation.

At the University of Surrey’s School of Health Sciences, their team has long been interested in the use of big data in medical research. Originally, they wanted to understand how Open Science policies encourage data sharing. But over the past year, their work has taken an unexpected turn.

An explosion of studies using publicly available datasets — especially the U.S. National Health and Nutrition Examination Survey (NHANES) — caught their attention. On the surface, this seemed like a win for scientific productivity. Between 2022 and 2024, the number of NHANES-related studies on PubMed nearly doubled, from around 4,600 to over 8,000.

But as Tulsi and Matt dug deeper, something didn’t sit right. Many of these papers looked eerily similar — not just in format, but in methods, conclusions, and even titles. “We began to suspect that these weren’t traditional research articles,” said Matt. “They looked mass-produced.”

And with the rapid rise of generative AI tools like ChatGPT and increasingly automated data analysis using Python and R libraries, their suspicions seemed more than plausible. After all, it’s now easy to write a research paper — or at least something that looks like one — without deep subject matter expertise.

Their study, conducted alongside colleagues from Aberystwyth University, found disturbing evidence: a growing wave of “formulaic” research papers that follow a plug-and-play model. Many focused on overly simple statistical associations, such as linking one food item or inflammatory marker to one disease — ignoring the complexity of real-world health science.

The problems were more than stylistic. These studies often cherry-picked data, for example by limiting the date range or selecting only a subset of patients without clear justification. Statistically, they skipped a basic safeguard such as false discovery rate (FDR) correction, a method used to avoid mistaking random noise for meaningful results. When Matt and Tulsi retroactively applied FDR corrections to some of these studies, more than half of the findings fell apart.
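The article does not say which correction procedure the team applied; the Benjamini-Hochberg method, available through Python's statsmodels library, is one common way to control the false discovery rate. The sketch below uses invented p-values purely for illustration, and shows how a list of raw "significant" associations can thin out once the adjustment is applied.

```python
# Minimal sketch of a false discovery rate (FDR) correction using the
# Benjamini-Hochberg procedure via statsmodels. The p-values are invented
# for illustration; they are not taken from any of the studies discussed above.
from statsmodels.stats.multitest import multipletests

# Raw p-values a formulaic paper might report for ten single-variable
# associations; seven of them look "significant" at the usual 0.05 threshold.
raw_p = [0.003, 0.012, 0.021, 0.034, 0.045, 0.048, 0.049, 0.11, 0.22, 0.41]

# method="fdr_bh" applies Benjamini-Hochberg; `rejected` flags the results
# that still pass at alpha = 0.05 after correction.
rejected, adjusted_p, _, _ = multipletests(raw_p, alpha=0.05, method="fdr_bh")

for raw, adj, keep in zip(raw_p, adjusted_p, rejected):
    status = "still significant" if keep else "no longer significant"
    print(f"raw p = {raw:.3f}  adjusted p = {adj:.3f}  ({status})")
```

In this toy example only the smallest p-value survives the adjustment, which mirrors the pattern the researchers describe: once multiple comparisons are properly accounted for, most of the headline associations disappear.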

Worse yet, they noticed a growing trend of papers presenting single-variable causes for multifactorial diseases. “It’s seductive to believe that one food or one habit is the key to avoiding disease,” said Tulsi. “But that’s not how health works.” These misleading oversimplifications, dressed up in academic language, are often picked up by media outlets or cited by future researchers — especially if they’re free to access and easy to understand.

The deeper concern, however, is that bad science may now be feeding itself. Because AI models are trained on open-access articles, they are learning from — and in some cases perpetuating — these flawed studies. “If generative AI is being taught that these are valid examples of scientific writing,” Matt warned, “we’re building a feedback loop of misinformation.”

The duo emphasized that while AI can be an incredible tool for accelerating genuine research, it also lowers the barrier for bad actors — including paper mills and opportunistic researchers — to flood journals with low-quality content. In the academic world of “publish or perish,” speed often trumps integrity.

“It’s like the junk food of academia,” Tulsi said. “It’s cheap, mass-produced, and designed to look appealing — but it’s not nourishing, and it can be harmful in the long run.”

So what’s the solution? Matt believes it starts with journal editors and peer reviewers taking a harder stance. “We need to stop rewarding papers that look good but offer little scientific value,” he said. Tools like COSIG (the Collection of Open Science Integrity Guides) can help readers and reviewers spot red flags — including repeated patterns, lack of transparency, or misapplied statistics.

But changing incentives will be much harder. The move toward open-access publishing, while well-intentioned, has created new challenges — such as article processing charges (APCs), which can unintentionally favor quantity over quality.

“We’re not anti-AI,” Tulsi clarified. “We’re pro-integrity.” She sees enormous potential for AI to enhance real research when used ethically — by reducing repetition, speeding up standard processes, and even identifying previously unseen patterns. But the technology is only as good as the motives of the people using it.

As AI becomes ever more embedded in the scientific process, Tulsi and Matt hope their findings serve as a wake-up call. Not all research is created equal. Some may not be created by humans at all.

And in an age of automated science, critical thinking — and a healthy dose of skepticism — might just be our last line of defense.