How can we prevent AI models from cannibalizing themselves when human-generated data runs out? Scientists say they’ve found the answer.
Science

By News Room | May 21, 2026

While the evolution of artificial intelligence (AI) systems has shown no sign of slowing, there’s a growing concern that large language models (LLMs) will soon run out of human-made data to ingest and learn from.

Once this happens, scientists say, AI models will increasingly rely on synthetic, AI-made information, which will lead to an effect called “model collapse.” This is where LLMs spout gibberish, and the AI systems they underpin deliver inaccurate answers and hallucinate information far more often than they do today.

“That’s especially worrying considering some experts think that we will run out of high-quality human-generated data by the end of the year — so if you’re relying on this synthetic data, but there’s an almost existential threat it will sink your AI, you’re in trouble,” Yasser Roudi, a professor of disordered systems in the Department of Mathematics at King’s College London (KCL), told Live Science. “If, for example, you had LLMs that were used in hospitals to analyze brain scans and find cancers, if while training another model they experienced model collapse, these machines could misdiagnose people.”


However, Roudi recently found that model collapse can be bypassed by adding a single human-made data point to an AI’s training data, even if all the other data is AI-generated.

The study, which involved researchers from KCL, the Norwegian University of Science and Technology, and the Abdus Salam International Centre for Theoretical Physics in Italy, was published May 14 in the journal Physical Review Letters.

While AI model collapse hasn’t happened in a real-world scenario with an actively deployed AI system, anyone who uses tools like ChatGPT or Gemini to generate answers or text has very likely experienced errors or hallucinations. However, Roudi hopes the new findings might outline a method to sidestep this potential emergent threat.

Countering collapse

Beyond widely known hallucinations in primitive generative AI products, we may not have yet seen any dramatic examples of model collapse in the form of sophisticated AIs seemingly “going mad” and outputting complete nonsense. But signs of minor collapse could be observed when AI delivers increasingly inaccurate or bland answers to queries, or completely fabricates information while trying to generate some kind of output it assumes a user desires.

When LLMs are repeatedly trained on data generated by other LLMs, the core truth and source of the information, along with the spikes of variance between generations of models, get “smoothed out,” yielding homogenized answers and outputs. For example, text that reads well enough at first glance could lack any real detail or nuance. Model collapse can be split into “early” and “late” stages: in the former, an AI loses the ability to serve up edge-case (rare or less common) information and produces bland, synthetic-feeling responses; in the latter, LLMs deliver gibberish.

The huge scale of LLMs and the data they process can make it hard to establish how and why they hallucinate information, and how certain choices lead to model collapse.

To tackle this, the researchers used smaller models that belong to exponential families, a broad class of probability distributions that describe the likely outcomes of random events. The bell curve (the Gaussian distribution) is one example, as is the Bernoulli distribution, which gives the chance that a coin flip will land on heads.
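
For readers who want the formal definition, an exponential family is usually written in the textbook form below. The symbols here are the standard ones from statistics, not notation taken from the article or quoted from the study, and the coin-flip case is worked out only as an illustration.

```latex
% General exponential-family form: h is a base measure, T the sufficient
% statistic, \theta the natural parameter, A the log-normalizer.
p(x \mid \theta) = h(x)\,\exp\!\big(\theta^{\top} T(x) - A(\theta)\big)

% Coin flip (Bernoulli) with heads probability p and outcome x \in \{0, 1\}:
p(x \mid p) = p^{x}(1-p)^{1-x}
            = \exp\!\Big(x \log\tfrac{p}{1-p} + \log(1-p)\Big),
\qquad T(x) = x,\quad \theta = \log\tfrac{p}{1-p},\quad
A(\theta) = \log\!\big(1 + e^{\theta}\big),\quad h(x) = 1 .
```

Because fitting such a model depends on the data only through the sufficient statistic T(x), these models can be analyzed with pen and paper in a way that large neural networks cannot.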


“By looking at analytically tractable models such as the exponential families, you can answer those ‘why’ and ‘how’ questions,” Roudi said. “By that same logic, you can come up with ways to mitigate its dangerous effects, how those ways work, and ultimately apply them to real-life examples.”

The researchers discovered that they could avoid model collapse by introducing a single external, human-made data point into the pool of synthetic data used in closed-loop training, in which each new model is trained on data generated by a previous model.

Roudi said one example could be an AI-based image or video classifier, whereby an LLM is trained on data that includes a real image correctly classified by a human, rather than AI-generated media or media classified by an AI.

“In other words, this data point would be linked to a ‘ground truth,’ something we know undeniably to be true and independently verifiable,” Roudi said.
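
The idea can be tried out on the coin-flip example with a toy simulation. The sketch below is not the researchers' code or their exact setup. It assumes a pool of 20 synthetic flips per generation, a plain "fraction of heads" refit, and one fresh flip from a fair ground-truth coin added in each generation. The function name `retrain_coin` and all of its parameters are made up for illustration.

```python
import random


def retrain_coin(n_synthetic, generations, p_start, p_true=None, seed=0):
    """Refit a coin's heads probability on its own samples, generation after generation.

    If p_true is given, one extra flip drawn from that ground-truth coin is
    added to every generation's training pool (an assumption of this toy).
    Returns the list of fitted probabilities, one per generation.
    """
    rng = random.Random(seed)
    p = p_start
    history = []
    for _ in range(generations):
        # Synthetic data: flips sampled from the current model.
        flips = [1 if rng.random() < p else 0 for _ in range(n_synthetic)]
        if p_true is not None:
            # One "human-made" flip from the ground-truth coin.
            flips.append(1 if rng.random() < p_true else 0)
        # Maximum-likelihood refit: the fraction of heads in the pool.
        p = sum(flips) / len(flips)
        history.append(p)
    return history


# Pure closed loop: every generation sees only synthetic flips.
closed = retrain_coin(n_synthetic=20, generations=5000, p_start=0.5)
# Anchored loop: the same, plus one real flip per generation.
anchored = retrain_coin(n_synthetic=20, generations=5000, p_start=0.5, p_true=0.5)

print("closed loop, final estimate:", closed[-1])
print("anchored, average of estimates:", sum(anchored) / len(anchored))
```

In the pure closed loop, the estimate drifts until it hits 0 or 1 and then stays there, because a model that is certain of one outcome can only generate that outcome. With one real flip per generation, the estimate in this toy keeps fluctuating instead of freezing. Whether the same holds for large models is the open question the researchers say they plan to test.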

The next step for Roudi and the researchers is to apply this approach to larger and more complex models to see if this principle still holds true. This method could mitigate potentially “disastrous” scenarios of model collapse, especially within the AI models we use in everyday life, the team said.

“This research is the first step in setting out some ground rules for preventing this [from] happening in the future,” Roudi concluded. “While more work should be done, AI engineers making things like the next ChatGPT can use what we’ve found to develop models that don’t collapse.”

Jangjoo, F., Di Sarra, G., Marsili, M., & Roudi, Y. (2026). Lost in Retraining: Closed-Loop learning and model collapse in exponential families. Physical Review Letters, 136(19). https://doi.org/10.1103/156q-3ngc
