QwenLong-L1 solves the long-context reasoning challenge that stumps current LLMs

By NextTech · June 1, 2025


Alibaba Group has introduced QwenLong-L1, a new framework that enables large language models (LLMs) to reason over extremely long inputs. The development could unlock a new wave of enterprise applications that require models to understand and draw insights from extensive documents such as detailed corporate filings, lengthy financial statements, or complex legal contracts.

The challenge of long-form reasoning for AI

Recent advances in large reasoning models (LRMs), particularly through reinforcement learning (RL), have significantly improved their problem-solving capabilities. Research shows that when trained with RL fine-tuning, LRMs acquire skills similar to human “slow thinking,” where they develop sophisticated strategies to tackle complex tasks.

However, these improvements are primarily seen when models work with relatively short pieces of text, typically around 4,000 tokens. The ability of these models to scale their reasoning to much longer contexts (e.g., 120,000 tokens) remains a major challenge. Such long-form reasoning requires a robust understanding of the entire context and the ability to perform multi-step analysis. “This limitation poses a significant barrier to practical applications requiring interaction with external knowledge, such as deep research, where LRMs must collect and process information from knowledge-intensive environments,” the developers of QwenLong-L1 write in their paper.

The researchers formalize these challenges into the concept of “long-context reasoning RL.” Unlike short-context reasoning, which often relies on knowledge already stored within the model, long-context reasoning RL requires models to accurately retrieve and ground relevant information from lengthy inputs. Only then can they generate chains of reasoning based on this incorporated information.

Training models for this through RL is tricky and often results in inefficient learning and unstable optimization. Models struggle to converge on good solutions or lose their ability to explore diverse reasoning paths.

QwenLong-L1: A multi-stage approach

QwenLong-L1 is a reinforcement learning framework designed to help LRMs transition from proficiency with short texts to robust generalization across long contexts. The framework enhances existing short-context LRMs through a carefully structured, multi-stage process:

Warm-up Supervised Fine-Tuning (SFT): The model first undergoes an SFT phase in which it is trained on examples of long-context reasoning. This stage establishes a solid foundation, enabling the model to ground information accurately from long inputs. It helps develop fundamental capabilities in understanding context, generating logical reasoning chains, and extracting answers.

Curriculum-Guided Phased RL: At this stage, the model is trained through multiple phases, with the target length of the input documents gradually increasing. This systematic, step-by-step approach helps the model stably adapt its reasoning strategies from shorter to progressively longer contexts. It avoids the instability often seen when models are abruptly trained on very long texts.

Difficulty-Aware Retrospective Sampling: The final training stage incorporates challenging examples from the preceding training phases, ensuring the model keeps learning from the hardest problems. This prioritizes difficult instances and encourages the model to explore more diverse and complex reasoning paths.
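
As a rough mental model of how these three stages fit together, here is a minimal Python sketch. The helper callables (sft_train, rl_train, is_hard) and the data layout are hypothetical stand-ins, not the authors' released training code.

```python
from typing import Callable, List

def qwenlong_l1_recipe(
    model,
    sft_data: List[dict],         # long-context reasoning examples for warm-up
    rl_phases: List[List[dict]],  # RL data bucketed by input length, shortest first
    sft_train: Callable,
    rl_train: Callable,
    is_hard: Callable,            # e.g. flags examples the model still fails often
):
    # Stage 1: warm-up supervised fine-tuning to ground answers in long inputs.
    model = sft_train(model, sft_data)

    # Stage 2: curriculum-guided phased RL over progressively longer contexts.
    hard_pool: List[dict] = []
    for i, phase_data in enumerate(rl_phases):
        is_final_phase = (i == len(rl_phases) - 1)
        # Stage 3: difficulty-aware retrospective sampling mixes the hardest
        # examples from earlier phases back into the final stage of training.
        batch = phase_data + (hard_pool if is_final_phase else [])
        model = rl_train(model, batch)
        hard_pool.extend(ex for ex in phase_data if is_hard(model, ex))

    return model
```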

The QwenLong-L1 process. Source: arXiv

Beyond this structured training, QwenLong-L1 also uses a distinct reward system. While training for short-context reasoning tasks often relies on strict rule-based rewards (e.g., a correct answer in a math problem), QwenLong-L1 employs a hybrid reward mechanism. It combines rule-based verification, which ensures precision by checking for strict adherence to correctness criteria, with an “LLM-as-a-judge.” The judge model compares the semantics of the generated answer with the ground truth, allowing for more flexibility and better handling of the many ways correct answers can be expressed in long, nuanced documents.
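
In code, such a hybrid reward might look like the sketch below. The answer-extraction convention and the llm_judge callable are illustrative assumptions; the released recipe's exact rules may differ.

```python
import re
from typing import Callable

def extract_final_answer(response: str) -> str:
    # Assumption: the model wraps its final answer in \boxed{...};
    # otherwise, fall back to the last non-empty line of the response.
    match = re.search(r"\\boxed\{(.+?)\}", response)
    if match:
        return match.group(1)
    lines = [line for line in response.strip().splitlines() if line.strip()]
    return lines[-1] if lines else ""

def hybrid_reward(response: str, ground_truth: str,
                  llm_judge: Callable[[str, str], bool]) -> float:
    predicted = extract_final_answer(response).strip().lower()
    # Rule-based verification: strict string match after light normalization.
    if predicted == ground_truth.strip().lower():
        return 1.0
    # LLM-as-a-judge: accept answers that are semantically equivalent
    # to the ground truth even when worded differently.
    return 1.0 if llm_judge(predicted, ground_truth) else 0.0
```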

Putting QwenLong-L1 to the test

The Alibaba team evaluated QwenLong-L1 using document question-answering (DocQA) as the primary task. This scenario is highly relevant to enterprise needs, where AI must understand dense documents to answer complex questions.
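
To make the setup concrete, a single DocQA example might be structured roughly as follows; the document, question, and answer here are placeholders rather than items from the paper's benchmarks.

```python
# Hypothetical DocQA example: the model must locate the relevant facts inside
# a very long document, reason over them step by step, and produce a final
# answer that a reward function can check against a ground truth.
long_document = "..."  # imagine a ~100,000-token corporate filing here

docqa_example = {
    "prompt": (
        "Read the following document and answer the question.\n\n"
        f"<document>\n{long_document}\n</document>\n\n"
        "Question: What was the company's total revenue in fiscal 2024?\n"
        "Reason step by step, citing the passages you rely on, "
        "and put your final answer in \\boxed{}."
    ),
    "ground_truth": "$4.2 billion",  # placeholder value
}
```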

Experimental results across seven long-context DocQA benchmarks confirmed QwenLong-L1’s capabilities. Notably, the QWENLONG-L1-32B model (based on DeepSeek-R1-Distill-Qwen-32B) achieved performance comparable to Anthropic’s Claude-3.7 Sonnet Thinking, and outperformed models like OpenAI’s o3-mini and Qwen3-235B-A22B. The smaller QWENLONG-L1-14B model also outperformed Google’s Gemini 2.0 Flash Thinking and Qwen3-32B.

Source: arXiv

An important finding relevant to real-world applications is that RL training leads the model to develop specialized long-context reasoning behaviors. The paper notes that models trained with QwenLong-L1 become better at “grounding” (linking answers to specific parts of a document), “subgoal setting” (breaking down complex questions), “backtracking” (recognizing and correcting their own mistakes mid-reasoning), and “verification” (double-checking their answers).

For instance, while a base model might get sidetracked by irrelevant details in a financial document or get stuck in a loop of over-analyzing unrelated information, the QwenLong-L1-trained model demonstrated an ability to engage in effective self-reflection. It could successfully filter out distractor details, backtrack from incorrect paths, and arrive at the correct answer.

Techniques like QwenLong-L1 could significantly expand the utility of AI in the enterprise. Potential applications include legal tech (analyzing thousands of pages of legal documents), finance (deep research on annual reports and financial filings for risk assessment or investment opportunities), and customer service (analyzing long customer interaction histories to provide more informed support). The researchers have released the code for the QwenLong-L1 recipe and the weights for the trained models.
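
For readers who want to experiment, the released checkpoints can be loaded with standard Hugging Face tooling along the following lines. The repository id shown is an assumption and should be verified against the official release, and a 32B model requires substantial GPU memory (or quantization) to run.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tongyi-Zhiwen/QwenLong-L1-32B"  # assumption: check the official release page
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

document = "..."  # a long contract, filing, or report
question = "What obligations does the supplier take on under section 7?"
messages = [{
    "role": "user",
    "content": f"<document>\n{document}\n</document>\n\nQuestion: {question}",
}]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=2048)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```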
