MBZUAI Releases K2 Think V2: A Fully Sovereign 70B Reasoning Model For Math, Code, And Science

By NextTech, January 28, 2026
Can a fully sovereign open reasoning model match state-of-the-art systems when every part of its training pipeline is transparent? Researchers from Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) have released K2 Think V2, a fully sovereign reasoning model designed to test how far open and fully documented pipelines can push long-horizon reasoning on math, code, and science when the entire stack is open and reproducible. K2 Think V2 takes the 70 billion parameter K2 V2 Instruct base model and applies a carefully engineered reinforcement learning recipe to turn it into a high-precision reasoning model that remains fully open in both weights and data.

https://arxiv.org/pdf/2512.06201

From K2 V2 base model to reasoning specialist

K2 V2 is a dense decoder-only transformer with 80 layers, hidden size 8192, and 64 attention heads with grouped query attention and rotary position embeddings. It is trained on around 12 trillion tokens drawn from the TxT360 corpus and related curated datasets that cover web text, math, code, multilingual data, and scientific literature.
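
As a quick sanity check on those shape numbers, the short Python sketch below derives the per-head width. It assumes the common convention that each attention head gets an equal slice of the hidden size; the article does not confirm that K2 V2 follows it, and the class name `K2V2Shape` is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class K2V2Shape:
    """Shape numbers quoted in the article; the class itself is hypothetical."""

    num_layers: int = 80
    hidden_size: int = 8192
    num_attention_heads: int = 64

    @property
    def head_dim(self) -> int:
        # Assumption: head width = hidden size / number of attention heads.
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        return self.hidden_size // self.num_attention_heads


shape = K2V2Shape()
print(shape.head_dim)  # 128
```

Under that assumption each head is 128 dimensions wide. The number of key/value heads used by grouped query attention is not given in the article, so it is left out.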

Training proceeds in three phases. Pretraining runs at a context length of 8192 tokens on natural data to establish robust general knowledge. Mid-training then extends context up to 512k tokens using TxT360 Midas, which mixes long documents, synthetic thinking traces, and diverse reasoning behaviors while carefully retaining at least 30 percent short-context data in every stage. Finally, supervised fine-tuning, called TxT360 3efforts, injects instruction following and structured reasoning signals.
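
The 30 percent floor on short-context data can be expressed as a simple check on a stage's data mix. The sketch below is only an illustration: the stage names and token budgets are invented, and only the 0.30 threshold comes from the article.

```python
MIN_SHORT_FRACTION = 0.30  # from the article: at least 30 percent short-context data


def short_fraction(token_counts: dict) -> float:
    """Share of a stage's tokens that come from the 'short' bucket."""
    total = sum(token_counts.values())
    if total <= 0:
        raise ValueError("stage has no tokens")
    return token_counts.get("short", 0.0) / total


def keeps_short_floor(token_counts: dict) -> bool:
    """True when the stage retains at least the required short-context share."""
    return short_fraction(token_counts) >= MIN_SHORT_FRACTION


# Hypothetical token budgets (in billions), not figures from the paper.
stages = {
    "mid_stage_a": {"short": 40.0, "long": 60.0},
    "mid_stage_b": {"short": 25.0, "long": 75.0},
}
results = {name: keeps_short_floor(mix) for name, mix in stages.items()}
print(results)  # {'mid_stage_a': True, 'mid_stage_b': False}
```

A check like this would flag `mid_stage_b`, whose short-context share of 25 percent falls below the floor.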

The important point is that K2 V2 is not a generic base model. It is explicitly optimized for long-context consistency and exposure to reasoning behaviors during mid-training. That makes it a natural foundation for a post-training stage that focuses solely on reasoning quality, which is exactly what K2 Think V2 does.

Fully sovereign RLVR on GURU dataset

K2 Think V2 is trained with a GRPO-style RLVR recipe on top of K2 V2 Instruct. The team uses the Guru dataset, version 1.5, which focuses on math, code, and STEM questions. Guru is derived from permissively licensed sources, expanded in STEM coverage, and decontaminated against key evaluation benchmarks before use. This matters for a sovereignty claim, because both the base model data and the RL data are curated and documented by the same institute.
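
RLVR stands for reinforcement learning with verifiable rewards: the reward comes from a programmatic check rather than a learned reward model. The sketch below shows one simple way such a check might look for a math question, assuming the final answer is wrapped in `\boxed{...}` and compared by exact string match. That format, and the helper names, are assumptions made for illustration and are not taken from the Guru pipeline.

```python
import re
from typing import Optional

# Matches \boxed{...} with no nested braces inside (a deliberate simplification).
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_final_answer(response: str) -> Optional[str]:
    """Return the content of the last boxed expression in the response, if any."""
    matches = BOXED.findall(response)
    return matches[-1].strip() if matches else None


def verifiable_reward(response: str, reference: str) -> float:
    """1.0 when the extracted answer equals the reference exactly, else 0.0."""
    answer = extract_final_answer(response)
    if answer is None:
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0


print(verifiable_reward("... so the total is \\boxed{42}", "42"))  # 1.0
print(verifiable_reward("I think it is 41", "42"))  # 0.0
```

Real verifiers typically normalize equivalent forms (for example, fractions versus decimals) and run unit tests for code tasks; this sketch does neither.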

The GRPO setup removes the usual KL and entropy auxiliary losses and uses asymmetric clipping of the policy ratio with the high clip set to 0.28. Training runs fully on-policy with temperature 1.2 to increase rollout diversity, a global batch size of 256, and no micro-batching. This avoids off-policy corrections that are known to introduce instability in GRPO-like training.
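
To make the asymmetric clipping concrete, here is a minimal sequence-level sketch of a GRPO-style objective in plain Python. Only the high clip of 0.28 comes from the article. The low clip of 0.20, the standard-deviation normalization of advantages, and the per-sequence (rather than per-token) formulation are assumptions, so this should be read as an illustration of the idea, not the team's implementation.

```python
import math
from statistics import mean, pstdev

CLIP_HIGH = 0.28  # from the article
CLIP_LOW = 0.20   # assumption: the article does not state the low clip


def group_advantages(rewards, eps=1e-6):
    """Standardize rewards within one prompt's group of rollouts."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def clipped_objective(logp_new, logp_old, advantage):
    """Clipped surrogate with asymmetric bounds and no KL or entropy term."""
    ratio = math.exp(logp_new - logp_old)
    clipped = min(max(ratio, 1.0 - CLIP_LOW), 1.0 + CLIP_HIGH)
    # Taking the minimum keeps the pessimistic (lower) estimate, as in PPO.
    return min(ratio * advantage, clipped * advantage)


def grpo_loss(logps_new, logps_old, rewards):
    """Negative mean objective over one group of rollouts."""
    advantages = group_advantages(rewards)
    terms = [
        clipped_objective(new, old, adv)
        for new, old, adv in zip(logps_new, logps_old, advantages)
    ]
    return -mean(terms)


# Fully on-policy case: new and old log-probabilities coincide, so ratio == 1.
on_policy_loss = grpo_loss([-3.0, -2.5, -4.0, -3.5],
                           [-3.0, -2.5, -4.0, -3.5],
                           [1.0, 0.0, 0.0, 1.0])
```

In this sketch, a strictly on-policy update has a ratio of exactly 1, so the clip bounds never bind on the objective's value. The wider upper bound only matters once the sampling policy and the updated policy diverge.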

RLVR itself runs in two stages. In the first stage, response length is capped at 32k tokens and the model trains for about 200 steps. In the second stage, the maximum response length is increased to 64k tokens and training continues for about 50 steps with the same hyperparameters. This schedule specifically exploits the long-context capability inherited from K2 V2 so that the model can practice full chain-of-thought trajectories rather than short solutions.
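
The two-stage schedule can be written down as a small lookup from training step to response cap. In the sketch below the step counts are the article's approximate figures treated as exact, and "32k" and "64k" are read as 32 × 1024 and 64 × 1024 tokens, which is an assumption. The function name is hypothetical.

```python
# (number of steps, max response tokens) per stage.
# The article says "about" 200 and "about" 50 steps; exact values are assumed here.
STAGES = [
    (200, 32 * 1024),
    (50, 64 * 1024),
]


def max_response_tokens(step: int) -> int:
    """Return the response-length cap that applies at a zero-based training step."""
    if step < 0:
        raise ValueError("step must be non-negative")
    boundary = 0
    for num_steps, cap in STAGES:
        boundary += num_steps
        if step < boundary:
            return cap
    raise ValueError("step is past the end of the schedule")


print(max_response_tokens(0))    # 32768
print(max_response_tokens(200))  # 65536
```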

https://mbzuai.ac.ae/information/k2-think-v2-a-fully-sovereign-reasoning-model/

Benchmark profile

K2 Think V2 targets reasoning benchmarks rather than purely knowledge benchmarks. On AIME 2025 it reaches a pass@1 of 90.42. On HMMT 2025 it scores 84.79. On GPQA Diamond, a difficult graduate-level science benchmark, it reaches 72.98. On SciCode it records 33.00, and on Humanity's Last Exam it reaches 9.5 under the benchmark settings.

These scores are reported as averages over 16 runs and are directly comparable only within the same evaluation protocol. The MBZUAI team also highlights improvements on IFBench and on the Artificial Analysis evaluation suite, with particular gains in hallucination rate and long-context reasoning compared with the previous K2 Think release.
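
One plausible reading of "averages over 16 runs" is that pass@1 is computed per run, as the percentage of problems solved on a single attempt, and then averaged across runs. The sketch below illustrates that reading with toy data for 4 runs and 5 problems. The data and the function name are made up, and the exact aggregation used by the authors may differ.

```python
from statistics import mean


def average_pass_at_1(run_results):
    """run_results[r][p] is True when run r solved problem p on its one attempt.

    Returns the mean of the per-run pass@1 percentages.
    """
    per_run = [100.0 * sum(run) / len(run) for run in run_results]
    return mean(per_run)


# Toy data: 4 runs on 5 problems (the article reports 16 runs).
runs = [
    [True, True, True, False, True],   # 80.0
    [True, True, False, False, True],  # 60.0
    [True, True, True, True, True],    # 100.0
    [True, False, True, False, True],  # 60.0
]
print(average_pass_at_1(runs))  # 75.0
```

Averaging across runs reduces the sampling noise that comes from temperature-based decoding, which is why single-run and multi-run numbers should not be mixed in a comparison.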

Safety and openness

The research team reports a Safety 4 style analysis that aggregates four safety surfaces. Content and public safety, truthfulness and reliability, and societal alignment all reach macro-average risk levels in the low range. Data and infrastructure risks remain higher and are marked as critical, which reflects concerns about the handling of sensitive personal information rather than model behavior alone. The team states that K2 Think V2 still shares the generic limitations of large language models despite these mitigations. On Artificial Analysis's Openness Index, K2 Think V2 sits at the frontier together with K2 V2 and Olmo-3.

Key Takeaways

  • K2 Think V2 is a fully sovereign 70B reasoning model: Built on K2 V2 Instruct, with open weights, open data recipes, detailed training logs, and the full RL pipeline released via Reasoning360.
  • Base model is optimized for long context and reasoning before RL: K2 V2 is a dense decoder transformer trained on around 12T tokens, with mid-training extending context length to 512K tokens and supervised '3 efforts' SFT targeting structured reasoning.
  • Reasoning is aligned using GRPO-based RLVR on the Guru dataset: Training uses a 2-stage on-policy GRPO setup on Guru v1.5, with asymmetric clipping, temperature 1.2, and response caps at 32K then 64K tokens to learn long chain-of-thought solutions.
  • Competitive results on hard reasoning benchmarks: K2 Think V2 reports strong pass@1 scores such as 90.42 on AIME 2025, 84.79 on HMMT 2025, and 72.98 on GPQA Diamond, positioning it as a high-precision open reasoning model for math, code, and science.

Max is an AI analyst at MarkTechPost, based in Silicon Valley, who actively shapes the future of technology. He teaches robotics at Brainvyne, combats spam with ComplyEmail, and leverages AI daily to translate complex tech advancements into clear, understandable insights.
