Science

‘Not how you build a digital mind’: How reasoning failures are preventing AI models from achieving human-level intelligence

By News Room, April 2, 2026

Architectural constraints in today’s most popular artificial intelligence (AI) tools may limit how much more intelligent they can get, new research suggests.

A study published Feb. 5 on the arXiv preprint server argues that modern large language models (LLMs) are inherently prone to breakdowns in their problem-solving logic, known as “reasoning failures.”

Reasoning failures occur when an LLM loses track of key information needed to reliably solve a task, resulting in incorrect answers to seemingly straightforward problems. The paper, which was presented as a review of existing research, looked specifically at transformer models, a type of neural network architecture that underpins popular AI chatbots including ChatGPT, Claude and Google Gemini.


Based on LLMs’ performance on evaluations such as Humanity’s Last Exam, some scientists say the underlying neural network architecture could one day lead to a model capable of human-level cognition. But while the transformer architecture makes LLMs extremely capable at tasks like language generation, the researchers argue that it also inhibits the kind of reliable logical processes needed to achieve true human-level reasoning.

“LLMs have exhibited remarkable reasoning capabilities, achieving impressive results across a wide range of tasks,” the researchers said in the study. “Despite these advances, significant reasoning failures persist, occurring even in seemingly simple scenarios … This failure is attributed to an inability of holistic planning and in-depth thinking.”

Limitations with LLMs

LLMs are trained on huge amounts of text data and generate responses to user prompts by predicting a plausible answer one piece at a time. They do this by stringing together units of text, called “tokens,” based on statistical patterns learned from their training data.
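To make the idea of next-token prediction concrete, here is a minimal Python sketch. It is a toy bigram counter, not how any production LLM works: the corpus, the function names `train_bigram` and `generate`, and the greedy "pick the most frequent follower" rule are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_tokens=5):
    """Greedily append the most frequent follower of the last token."""
    out = [start]
    for _ in range(max_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # no statistics for this token, so stop generating
        out.append(followers.most_common(1)[0][0])
    return out


# Hypothetical toy "training data", split on whitespace into tokens.
corpus = "the cat sat on the mat and the cat sat down".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the", max_tokens=2)))
```

The toy model continues "the" with whatever most often followed it in its training text. It has no notion of whether the continuation is true or logical, only of what is statistically likely.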

Transformers also use a mechanism called “self-attention” to keep track of relationships between words and concepts over long strings of text. Self-attention, combined with their massive training datasets, is what makes modern chatbots so good at generating convincing answers to user prompts.
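The core of self-attention can be sketched in a few lines of Python. This is a deliberately stripped-down illustration: it uses the input vectors directly as queries, keys and values, whereas real transformers apply learned projection matrices and run many attention heads in parallel. The function names and the example vectors are assumptions made for this sketch.

```python
import math


def softmax(xs):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with no learned projections."""
    d = len(vectors[0])
    outputs, weights = [], []
    for q in vectors:
        # Score this token against every token, scaled by sqrt(dimension).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted mix of all the token vectors.
        outputs.append([sum(wi * v[j] for wi, v in zip(w, vectors)) for j in range(d)])
    return outputs, weights


# Hypothetical 2-dimensional "embeddings" for three tokens.
tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
outputs, weights = self_attention(tokens)
print(weights[0])
```

In this example the first token gives more weight to the second token, which points in the same direction, than to the third, which does not. That weighting is how attention tracks which parts of a text relate to each other.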

However, LLMs don’t do any actual “thinking” in the conventional sense. Instead, their responses are produced by this statistical prediction process. For long tasks, particularly those that require genuine problem-solving across multiple steps, transformers can lose track of key information and default to the patterns learned from their training data. This results in reasoning failures.

“This fundamental weakness extends beyond basic tasks, to compositions of math problems, multi-fact claim verification, and other inherently compositional tasks,” the researchers said in the study.

Reasoning failures are also why LLMs often circle back to the same response to a user query even after being told it’s incorrect, or produce a different answer to the same question when it’s phrased slightly differently, even when they’re prompted to explain their reasoning step by step.


Federico Nanni, a senior research data scientist at the U.K.’s Alan Turing Institute, argues that what LLMs typically present as reasoning is mostly window dressing.

“People figured out that if you tell an LLM, instead of answering directly, to ‘think step by step’ and write out a reasoning process first, it often gets the right answer,” Nanni told Live Science. “But that’s a trick. It’s not real reasoning in the human sense — it’s still just next‑token prediction dressed up as a chain of thought,” he said. “When we say these models ‘reason,’ what we actually mean is that they write out a reasoning process — something that sounds like a plausible chain of reasoning.”

Gaps in existing AI benchmarks

Current ways to assess LLM performance fall short in three key areas, the researchers found. First, results can be affected by rewording a prompt. Second, benchmarks degrade and become contaminated the more they’re used. And finally, they only assess the outcome, rather than the reasoning process a model used to reach its conclusion.

This means current benchmarks may significantly overstate how capable LLMs are and understate how often they fail in real-world use.
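One way to probe the first gap, sensitivity to rewording, is to ask the same question several ways and measure how often the answers agree. The Python sketch below is an assumption-laden illustration, not the method used in the study: `toy_model` is a hypothetical stand-in for a real LLM call, built to be brittle on purpose, and `consistency` is an illustrative metric.

```python
from collections import Counter


def consistency(model, paraphrases):
    """Fraction of paraphrases whose answer matches the most common answer."""
    answers = [model(p) for p in paraphrases]
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)


def toy_model(prompt):
    # Hypothetical stand-in for an LLM call. It keys on surface wording,
    # so it fails when the same question is phrased differently.
    return "4" if "2 + 2" in prompt else "unsure"


prompts = [
    "What is 2 + 2?",
    "Compute 2 + 2.",
    "What do you get when you add two and two?",
]
score = consistency(toy_model, prompts)
print(f"consistency: {score:.2f}")
```

A score of 1.0 would mean the answer never changed across phrasings. Note that this check still looks only at outcomes: it says nothing about whether the reasoning behind a consistent answer was sound, which is the third gap the researchers describe.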

[Image caption: LLMs’ performance limitations may restrict their real-world applications. Image credit: da-kuk/Getty Images]

“Our position is not that benchmarks are flawed, but that they need to evolve,” study co-author Peiyang Song, a computer science and robotics student at Caltech, told Live Science via email. Nanni added that benchmarks tend to leak into LLM training data, meaning subsequent LLMs can figure out how to game them.

“On top of that, now that models are deployed in production, usage itself becomes a kind of benchmark,” Nanni said. “You put the system in front of users and see what goes wrong — that’s the new test. So yes, we need better benchmarks, and we need to rely less on AI to check AI. But that’s very hard in practice, because these tools are now woven into how we work, and it’s extremely convenient to just use them.”

A new architecture for AGI?

Unlike other recent research, the new study doesn’t argue that neural-network approaches to AI are a dead end in the quest to achieve artificial general intelligence (AGI). Rather, the researchers liken the current moment to the early days of computing, noting that understanding why LLMs fail is key to improving them.

However, they do argue that simply training models on more data or scaling them up is unlikely to resolve the issue on its own. This means developing AGI may require a fundamentally different approach to how models are built.

“Neural networks, and LLMs in particular, are clearly part of the AGI picture. Their progress has been extraordinary,” Song said. “However, our survey suggests that scaling alone is unlikely to resolve all reasoning failures … [meaning] reaching human-level reasoning may require architectural innovations, stronger world models, improved robustness training, and deeper integration with structured reasoning and embodied interaction.”

Nanni agreed. “From a philosophy‑of‑mind point of view, I’d say we’ve basically found the limits of transformers. They’re not how you build a digital mind,” he said. “They model text extremely well, to the point that it’s almost impossible to tell if a passage was written by a human or a machine. But that’s what they are: language models … There’s only so far you can push this architecture.”

© 2026 USA Times. All Rights Reserved.