NextTech News
A New Google AI Research Proposes Deep-Thinking Ratio to Improve LLM Accuracy While Cutting Total Inference Costs by Half

By NextTech | February 22, 2026 | 5 Mins Read
For the past few years, the AI world has followed a simple rule: if you want a Large Language Model (LLM) to solve a harder problem, make its Chain-of-Thought (CoT) longer. But new research from the University of Virginia and Google shows that ‘thinking long’ is not the same as ‘thinking hard’.

The research team shows that simply adding more tokens to a response can actually make an AI less accurate. Instead of counting words, the researchers introduce a new measurement: the Deep-Thinking Ratio (DTR).

https://arxiv.org/pdf/2602.13517

The Failure of ‘Token Maxing’

Engineers often use token count as a proxy for the effort an AI puts into a task. However, the researchers found that raw token count has an average correlation of r = -0.59 with accuracy.

This negative number means that as the model generates more text, it is more likely to be wrong. This happens because of ‘overthinking,’ where the model gets stuck in loops, repeats redundant steps, or amplifies its own errors. Relying on length alone wastes expensive compute on uninformative tokens.

What are Deep-Thinking Tokens?

The research team argued that real ‘thinking’ happens inside the layers of the model, not just in the final output. When a model predicts a token, it processes data through a series of transformer layers (L).

  1. Shallow Tokens: For easy words, the model’s prediction stabilizes early. The ‘guess’ doesn’t change much from layer 5 to layer 36.
  2. Deep-Thinking Tokens: For difficult logic or math symbols, the prediction shifts significantly in the deeper layers.

How to Measure Depth

To identify these tokens, the research team uses a technique to peek at the model’s internal ‘drafts’ at every layer. They project the intermediate hidden states ($h_{t,l}$) into the vocabulary space using the model’s unembedding matrix ($W_U$). This produces a probability distribution ($p_{t,l}$) for every layer.
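
The projection step can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's code: the function names, the toy dimensions, and the numbers are all made up for the example, and any normalization a real model applies before its unembedding matrix is left out.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def layer_distributions(hidden_states, unembed):
    """Project each layer's hidden state into vocabulary space.

    hidden_states: one vector of d floats per layer (h_{t,l} for a fixed token t).
    unembed: V rows of d floats each (the rows of W_U); hypothetical toy values here.
    Returns one probability distribution over the vocabulary per layer.
    """
    dists = []
    for h in hidden_states:
        # Logit for each vocabulary entry is the dot product of its row with h.
        logits = [sum(w * x for w, x in zip(row, h)) for row in unembed]
        dists.append(softmax(logits))
    return dists

# Toy example (assumed values): 3 layers, hidden size 2, vocabulary of 3 tokens.
hidden_states = [[0.1, 0.0], [0.5, 0.2], [2.0, -1.0]]
unembed = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]

for layer, p in enumerate(layer_distributions(hidden_states, unembed), start=1):
    print(layer, [round(x, 3) for x in p])
```

In this toy run the early layer is close to uniform, while the last layer puts most of its mass on the first vocabulary entry, which is the kind of layer-by-layer ‘draft’ the article describes.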

They then calculate the Jensen-Shannon Divergence (JSD) between the intermediate layer distribution and the final layer distribution ($p_{t,L}$):

$D_{t,l} := \mathrm{JSD}(p_{t,L} \,\|\, p_{t,l})$
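
For readers who want to see the divergence concretely, here is a small sketch of JSD using its standard textbook definition (the average of two KL divergences against the mixture). The helper names are illustrative, and the natural-log base is an assumption; the article does not say which base the paper uses.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats; terms where p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence: symmetric, and at most ln(2) in nats."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

final_layer = [0.9, 0.05, 0.05]   # assumed toy p_{t,L}
early_layer = [0.4, 0.3, 0.3]     # assumed toy p_{t,l}

print(jsd(final_layer, early_layer))   # positive: the early draft still differs
print(jsd(final_layer, final_layer))   # zero: identical distributions
```

Because JSD is bounded, a fixed threshold on $D_{t,l}$ means the same thing at every layer and for every token, which makes it a convenient yardstick for ‘has the prediction settled yet’.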

A token is a deep-thinking token if its prediction only settles in the ‘late regime’, defined by a depth fraction ($\rho$). In their tests, they set $\rho = 0.85$, meaning the token only stabilized in the final 15% of the layers.

The Deep-Thinking Ratio (DTR) is the percentage of these ‘hard’ tokens in a full sequence. Across models like DeepSeek-R1-70B, Qwen3-30B-Thinking, and GPT-OSS-120B, DTR showed a strong average positive correlation of r = 0.683 with accuracy.
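
Putting the pieces together, a DTR computation might look like the sketch below. The article does not spell out the exact settling rule, so this version assumes one: a token ‘settles’ at the first layer whose divergence from the final layer drops to or below a threshold. The threshold value, the function names, and the toy divergence profiles are all hypothetical.

```python
import math

def settling_layer(divergences, threshold):
    """1-based index of the first layer whose divergence from the final
    layer is at or below `threshold` (assumed settling rule)."""
    for layer, d in enumerate(divergences, start=1):
        if d <= threshold:
            return layer
    return len(divergences)  # never settled early: count it as the last layer

def deep_thinking_ratio(per_token_divergences, rho=0.85, threshold=0.5):
    """Fraction of tokens that settle only in the late regime.

    per_token_divergences: for each token, the list D_{t,1..L}.
    rho: depth fraction from the article (0.85).
    threshold: hypothetical settling threshold, not taken from the article.
    """
    num_layers = len(per_token_divergences[0])
    late_start = math.ceil(rho * num_layers)
    deep = sum(
        1 for divs in per_token_divergences
        if settling_layer(divs, threshold) >= late_start
    )
    return deep / len(per_token_divergences)

# Toy divergence profiles for 4 tokens in a 10-layer model (assumed values).
tokens = [
    [0.6, 0.4, 0.2, 0.1, 0.05, 0.02, 0.01, 0.0, 0.0, 0.0],  # settles at layer 2
    [0.69] * 8 + [0.3, 0.0],                                 # settles at layer 9
    [0.69] * 9 + [0.0],                                      # settles at layer 10
    [0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.0, 0.0, 0.0, 0.0],  # settles at layer 1
]

print(deep_thinking_ratio(tokens))  # 0.5: two of four tokens settle late
```

With 10 layers and $\rho = 0.85$, the late regime starts at layer 9, so only the second and third tokens count as deep-thinking.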


Think@n: Better Accuracy at 50% the Cost

The research team used this approach to create Think@n, a new way to scale AI performance during inference.

Most devs use Self-Consistency (Cons@n), where they sample 48 different answers and use majority voting to pick the best one. This is very expensive because you have to generate every single token for every answer.

Think@n changes the game by using ‘early halting’:

  • The model starts generating multiple candidate answers.
  • After just 50 prefix tokens, the system calculates the DTR for each candidate.
  • It immediately stops generating the ‘unpromising’ candidates with low DTR.
  • It only finishes the candidates with high deep-thinking scores.
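
The early-halting steps above can be sketched as a small selection routine. This is an illustration under stated assumptions rather than the paper's implementation: the `keep_fraction` parameter, the majority vote over the surviving candidates, the `finish` callback, and all the toy scores and answers are hypothetical.

```python
from collections import Counter

def think_at_n(prefix_dtrs, finish, keep_fraction=0.5):
    """Keep the highest-DTR candidates, finish only those, then vote.

    prefix_dtrs: DTR of each candidate, measured on its short prefix.
    finish: callable that takes a candidate index, completes that
            generation, and returns its final answer (stand-in for the LLM).
    keep_fraction: share of candidates to finish (assumed value).
    """
    n_keep = max(1, int(len(prefix_dtrs) * keep_fraction))
    ranked = sorted(range(len(prefix_dtrs)),
                    key=lambda i: prefix_dtrs[i], reverse=True)
    kept = ranked[:n_keep]                 # low-DTR candidates are halted here
    answers = [finish(i) for i in kept]    # only survivors are generated in full
    winner = Counter(answers).most_common(1)[0][0]
    return winner, kept

# Toy data (assumed): six candidates with prefix DTR scores and the answers
# they would produce if they were allowed to run to completion.
prefix_dtrs = [0.10, 0.32, 0.05, 0.28, 0.30, 0.08]
final_answers = ["17", "42", "17", "42", "41", "17"]

def finish(i):
    return final_answers[i]

answer, kept = think_at_n(prefix_dtrs, finish)
print(answer, sorted(kept))
```

In this toy run, a plain majority vote over all six candidates would pick "17", while voting only among the three high-DTR candidates picks "42", and half of the generations are never completed. That is the trade the method is betting on; whether it pays off in practice depends on how well prefix DTR tracks correctness.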

The Results on AIME 2025

Method                          Accuracy   Avg. Cost (k tokens)
Cons@n (Majority Vote)          92.7%      307.6
Think@n (DTR-based Selection)   94.7%      155.4

On the AIME 25 math benchmark, Think@n achieved higher accuracy than standard voting while reducing the inference cost by 49%.
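
As a quick check, the saving and the accuracy gain follow directly from the two rows reported in the table (plain arithmetic on those figures, nothing beyond them):

```latex
\[
\text{cost reduction} = 1 - \frac{155.4}{307.6} \approx 0.495,
\qquad
\text{accuracy gain} = 94.7\% - 92.7\% = 2.0 \text{ points}
\]
```

So Think@n uses about half the tokens of Cons@n on this benchmark, in line with the roughly 49% figure quoted above.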

Key Takeaways

  • Token count is a poor predictor of accuracy: Raw output length has an average negative correlation (r = -0.59) with performance, meaning longer reasoning traces often signal ‘overthinking’ rather than higher quality.
  • Deep-thinking tokens define true effort: Unlike simple tokens that stabilize in early layers, deep-thinking tokens are those whose internal predictions undergo significant revision in deeper model layers before converging.
  • The Deep-Thinking Ratio (DTR) is a superior metric: DTR measures the proportion of deep-thinking tokens in a sequence and shows a robust positive correlation with accuracy (average r = 0.683), consistently outperforming length-based or confidence-based baselines.
  • Think@n enables efficient test-time scaling: By prioritizing and finishing only the samples with high deep-thinking ratios, the Think@n strategy matches or exceeds the performance of standard majority voting (Cons@n).
  • Massive cost reduction via early halting: Because DTR can be estimated from a short prefix of just 50 tokens, unpromising generations can be rejected early, reducing total inference costs by roughly 50%.
