Google AI breakthrough means chatbots use six times less memory during conversations without compromising performance
Science

By News Room | April 30, 2026

Google engineers have developed a method to compress artificial intelligence (AI) data so that it requires up to six times less working memory to function.

With the new system, called TurboQuant, AI algorithms could retain the same amount of information and perform equally powerful computations, but with significantly less memory hardware, the company says.

Current AI algorithms need a lot of working memory, known as the key-value (KV) cache, to work properly. This is where intermediate computational results and other bits of info are stored temporarily during active processing.


For example, if you ask ChatGPT what the weather will be like tomorrow in your area, it may store words like “weather” and “tomorrow,” along with your location and partial guesses, like “It might be rainy,” in the KV cache while it generates its response. The larger an AI model’s KV cache is, the more information it can keep track of at once and the more powerful it is.

A single sentence uses only a few dozen tokens (the building blocks of AI prompts and output text), but storing hundreds of thousands of tokens in the KV cache for more sophisticated work can require tens of gigabytes of memory. These memory requirements scale linearly with the number of users being served at once, and ChatGPT is known to receive billions of requests every day.
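
To see how token counts turn into gigabytes, here is a rough back-of-the-envelope sketch. The model dimensions (32 layers, 8 key-value heads, 128 numbers per head, 16 bits per number) are illustrative assumptions for an 8-billion-parameter-class model, not figures from Google's announcement, and the function name is made up for this example.

```python
# Rough KV-cache sizing. All model dimensions below are illustrative
# assumptions, not figures taken from the article.

GIB = 1024 ** 3


def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bits_per_value):
    """Bytes needed to cache keys and values for `tokens` tokens."""
    # The factor 2 covers one key vector plus one value vector per head.
    values_per_token = 2 * layers * kv_heads * head_dim
    return tokens * values_per_token * bits_per_value / 8


# Hypothetical 8B-class model configuration (assumed).
config = dict(layers=32, kv_heads=8, head_dim=128)

for tokens in (50, 128_000, 500_000):
    size = kv_cache_bytes(tokens, bits_per_value=16, **config)
    print(f"{tokens:>7} tokens at 16 bits: {size / GIB:8.3f} GiB")

# The cache is kept per conversation, so memory grows with concurrent users.
per_user = kv_cache_bytes(128_000, bits_per_value=16, **config)
print(f"1,000 concurrent users: {1000 * per_user / GIB:,.0f} GiB")
```

Under these assumptions a short sentence costs a few megabytes, while a context of several hundred thousand tokens lands in the tens of gigabytes for a single conversation.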

The compression algorithm decreases the amount of working memory an AI model needs to perform the same computations. It does so via a process called quantization, which represents each stored value with fewer bits.
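
As a minimal sketch of what quantization means in general, the snippet below squeezes a list of floating-point numbers into 3-bit integer codes and then restores approximate values. This is plain uniform quantization, chosen for illustration; it is not TurboQuant, and the function names and sample values are assumptions.

```python
def quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1], plus a scale and offset."""
    lo, hi = min(values), max(values)
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


values = [0.12, -0.87, 0.45, 0.99, -0.33, 0.05]  # made-up sample data
codes, scale, lo = quantize(values, bits=3)
restored = dequantize(codes, scale, lo)

# Largest gap between an original value and its restored approximation.
worst = max(abs(a - b) for a, b in zip(values, restored))
print("codes:", codes)
print("worst error:", worst)
```

Each value now needs 3 bits instead of 16 or 32, at the cost of a small rounding error and of storing the scale and offset alongside the codes.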

Although Google has been using quantization on its neural networks for many years, it has typically been applied statically: the compression is done once and doesn’t change as the model runs. The difference with TurboQuant is that it reduces the KV cache’s memory in real time, a tricky feat given that it must keep the quantized data in the cache accurate and up to date while the model generates outputs.
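
The difference between one-off and real-time compression can be sketched with a toy cache that quantizes every new vector at the moment it arrives, rather than compressing a fixed set of numbers ahead of time. The class name, the 4-bit setting and the sine-wave stand-in data are all assumptions made for illustration; the per-vector rounding scheme is a generic one, not Google's method.

```python
import math


class QuantizedKVCache:
    """Toy cache that compresses each vector at the moment it is appended."""

    def __init__(self, bits=4):
        self.bits = bits
        self.entries = []  # one (codes, scale, lo) tuple per cached token

    def append(self, vector):
        lo, hi = min(vector), max(vector)
        levels = 2 ** self.bits - 1
        scale = (hi - lo) / levels if hi > lo else 1.0
        codes = [round((v - lo) / scale) for v in vector]
        self.entries.append((codes, scale, lo))

    def read(self, index):
        """Return the approximate vector stored for one token."""
        codes, scale, lo = self.entries[index]
        return [lo + c * scale for c in codes]


cache = QuantizedKVCache(bits=4)
for step in range(5):
    # Stand-in for the vector a model would produce at each generation step.
    new_vector = [math.sin(step + i) for i in range(8)]
    cache.append(new_vector)

print(len(cache.entries), "tokens cached")
print(cache.read(2))
```

Because the cache keeps growing while the model is still reading from it, every append has to be cheap and every read has to stay accurate, which is the difficulty the article describes.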

In a statement, Google representatives said TurboQuant “showed great promise for reducing key-value bottlenecks without sacrificing AI model performance” in tests on Meta’s Llama 3.1-8B, Google’s Gemma and Mistral AI models.

“This has potentially profound implications for all compression-reliant use cases, including and especially in the domains of search and AI,” they added.

Is this Google’s “DeepSeek moment”?

Google says TurboQuant could reduce the KV cache’s size by a factor of at least six, using two methods: PolarQuant and Quantized Johnson-Lindenstrauss (QJL).


To interpret these methods, it is important to understand that data in the AI’s working memory has been turned into vectors — groups of numbers that have a defined size (radius) and direction (angle). Vectors can be mathematically “rotated,” meaning they are reexpressed in a different, common coordinate system.
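
For readers who want to see radius, angle and rotation in concrete terms, here is a small two-dimensional sketch. Real AI vectors have many more dimensions; the two-dimensional case and the helper names are simplifying assumptions.

```python
import math


def to_polar(x, y):
    """Return (radius, angle in radians) for a 2D vector."""
    return math.hypot(x, y), math.atan2(y, x)


def from_polar(radius, angle):
    """Return (x, y) for a radius and angle."""
    return radius * math.cos(angle), radius * math.sin(angle)


def rotate(x, y, theta):
    """Rotate a 2D vector counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y, s * x + c * y


x, y = 3.0, 4.0
radius, angle = to_polar(x, y)

# Rotating changes the direction but leaves the size untouched.
rx, ry = rotate(x, y, math.pi / 6)
r2, a2 = to_polar(rx, ry)

print("before:", radius, angle)
print("after: ", r2, a2)
```

The rotation shifts the angle by the chosen amount while the radius stays the same, which is why a rotation can realign vectors without losing the information they carry.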

PolarQuant reexpresses AI data from Cartesian coordinates (along X, Y and Z axes) into polar coordinates (a radius plus angles measured around a single point). The rotation aligns the angles of the vectors more consistently, thereby allowing them to be compressed into fewer bits with less additional scaling information. The vectors then go through the QJL optimization method, where they are adjusted very slightly to correct any computational errors stemming from the quantization.
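
The idea of quantizing in polar form can be sketched in two dimensions as follows. This is an illustrative toy under stated assumptions, not Google's PolarQuant: the function names are invented, the radius is stored unchanged, and the QJL correction step is not modeled at all.

```python
import math


def polar_quantize(x, y, angle_bits):
    """Keep the radius as-is and store the angle as a small integer code."""
    radius = math.hypot(x, y)
    angle = math.atan2(y, x) % (2 * math.pi)  # map the angle into [0, 2*pi)
    levels = 2 ** angle_bits
    code = round(angle / (2 * math.pi) * levels) % levels
    return radius, code


def polar_dequantize(radius, code, angle_bits):
    """Rebuild an approximate (x, y) from the radius and the angle code."""
    angle = code * 2 * math.pi / 2 ** angle_bits
    return radius * math.cos(angle), radius * math.sin(angle)


points = [(3.0, 4.0), (-1.5, 0.2), (0.7, -2.4)]  # made-up sample vectors
for bits in (2, 4, 8):
    worst = 0.0
    for x, y in points:
        radius, code = polar_quantize(x, y, bits)
        xq, yq = polar_dequantize(radius, code, bits)
        worst = max(worst, math.hypot(xq - x, yq - y) / radius)
    print(f"{bits} angle bits: worst relative error {worst:.4f}")
```

An angle always falls in the same fixed range, so every vector can share one set of quantization steps instead of each block of numbers carrying its own scale and offset. That is one plausible reading of the "less additional scaling information" the article mentions.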

In a post on the social media platform X, Matthew Prince, CEO of web security company Cloudflare, called the compression breakthrough “Google’s DeepSeek,” a reference to the surprise release of the Chinese firm’s AI model that achieved comparable results to leading chatbots at a fraction of the cost.

Google’s March 24 unveiling of TurboQuant sent shares of memory and storage companies like SanDisk, Western Digital and Seagate plummeting. But although the discovery could prove pivotal in improving AI efficiency, it is still at the lab stage and has yet to be widely rolled out in real-world models.

Moreover, TurboQuant will compress only the working memory used during inference, which is when a model is generating a response to a prompt. A model’s training typically requires up to four times more memory than that, so the actual impact on memory will be relatively small.

This is what Merrill Lynch banker Vivek Arya explained to concerned investors in a note, according to ZDNet: “(The) 6x improvement in memory efficiency [will] likely [lead] to 6x increase in accuracy (model size) and/or context length (KV cache allocation), rather than 6x decrease in memory.”
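
Arya's point can be illustrated with simple arithmetic: if the memory budget stays fixed, a sixfold compression buys roughly six times the context rather than a smaller memory bill. The 16 GiB budget and the 131,072 bytes per token are assumed figures for a hypothetical 8-billion-parameter-class model, not numbers from the note.

```python
GIB = 1024 ** 3


def max_context_tokens(budget_bytes, bytes_per_token, compression=1):
    """Tokens that fit in a fixed KV-cache budget at a given compression factor."""
    return (budget_bytes * compression) // bytes_per_token


budget = 16 * GIB          # assumed memory set aside for the KV cache
bytes_per_token = 131_072  # assumed uncompressed KV-cache cost per token

before = max_context_tokens(budget, bytes_per_token)
after = max_context_tokens(budget, bytes_per_token, compression=6)

print("tokens without compression:", before)
print("tokens with 6x compression:", after)
```

The hardware in this sketch is identical in both cases; only the amount of conversation it can hold changes.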

Google officially unveiled TurboQuant at ICLR 2026, which took place April 23-27 in Rio de Janeiro, and will formally present PolarQuant and QJL at AISTATS 2026 in Tangier, Morocco, in early May.
