NextTech News
AI & Machine Learning

OpenAI Introduces GPT-5.2: A Long Context Workhorse For Agents, Coding And Knowledge Work

By NextTech, December 11, 2025. 6 min read.

OpenAI has just released GPT-5.2, its most advanced frontier model for professional work and long running agents, and is rolling it out across ChatGPT and the API.

GPT-5.2 is a family of three variants. In ChatGPT, users see ChatGPT-5.2 Instant, Thinking and Pro. In the API, the corresponding models are gpt-5.2-chat-latest, gpt-5.2, and gpt-5.2-pro. Instant targets everyday assistance and learning, Thinking targets complex multi-step work and agents, and Pro allocates more compute for hard technical and analytical tasks.
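The variant-to-model mapping described above can be captured in a small lookup. This is a minimal sketch: the three model identifiers come from the article, while the dictionary and the helper name `api_model_for` are hypothetical conveniences, not part of any OpenAI SDK.

```python
# Mapping from the ChatGPT variant name to the API model identifier,
# as listed in the article. The helper below is a hypothetical convenience.
CHATGPT_TO_API_MODEL = {
    "Instant": "gpt-5.2-chat-latest",
    "Thinking": "gpt-5.2",
    "Pro": "gpt-5.2-pro",
}


def api_model_for(variant: str) -> str:
    """Return the API model id for a ChatGPT variant name (case-insensitive)."""
    key = variant.strip().title()
    if key not in CHATGPT_TO_API_MODEL:
        known = ", ".join(CHATGPT_TO_API_MODEL)
        raise ValueError(f"Unknown variant {variant!r}; expected one of: {known}")
    return CHATGPT_TO_API_MODEL[key]


if __name__ == "__main__":
    for name in ("instant", "Thinking", "PRO"):
        print(name, "->", api_model_for(name))
```

Keeping the mapping in one place makes it easier to route everyday requests to the Instant model and reserve the Pro model for the hard analytical tasks the article describes.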

Benchmark profile, from GDPval to SWE-Bench

GPT-5.2 Thinking is positioned as the main workhorse for real-world knowledge work. On GDPval, an evaluation of well-specified knowledge tasks across 44 occupations in 9 large industries, it beats or ties top industry professionals on 70.9 % of comparisons, while producing outputs at more than 11 times the speed and under 1 % of the estimated expert cost. For engineering teams this means the model can reliably generate artifacts such as presentations, spreadsheets, schedules, and diagrams given structured instructions.

On an internal benchmark of junior investment banking spreadsheet modeling tasks, average scores rise from 59.1 % with GPT-5.1 to 68.4 % with GPT-5.2 Thinking and 71.7 % with GPT-5.2 Pro. These tasks include three-statement models and leveraged buyout models with constraints on formatting and citations, which is representative of many structured business workflows.

In software engineering, GPT-5.2 Thinking reaches 55.6 % on SWE-Bench Pro and 80.0 % on SWE-bench Verified. SWE-Bench Pro evaluates repository-level patch generation over multiple languages, while SWE-bench Verified focuses on Python.

Long context and agentic workflows

Long context is a core design target. GPT-5.2 Thinking sets a new state of the art on OpenAI MRCRv2, a benchmark that inserts multiple identical “needle” queries into long dialogue “haystacks” and measures whether the model can reproduce the correct answer. It is the first model reported to reach near 100 % accuracy on the 4 needle MRCR variant out to 256k tokens.
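To make the needle-in-a-haystack idea concrete, here is a toy sketch of how such a test case could be assembled and scored. It is not OpenAI's MRCRv2 harness: the function names, the turn format, and the string-similarity scoring are all assumptions chosen for illustration.

```python
import random
from difflib import SequenceMatcher


def build_haystack(needle_request, needle_answers, filler_turns, seed=0):
    """Scatter identical needle requests (each with its own answer) among
    filler turns, preserving the order of the needle answers."""
    rng = random.Random(seed)
    total = len(filler_turns) + len(needle_answers)
    needle_slots = set(rng.sample(range(total), len(needle_answers)))
    fillers = iter(filler_turns)
    answers = iter(needle_answers)
    conversation = []
    for i in range(total):
        if i in needle_slots:
            conversation.append({"user": needle_request, "assistant": next(answers)})
        else:
            conversation.append(next(fillers))
    return conversation


def score(expected: str, produced: str) -> float:
    """Similarity in [0, 1] between the expected needle answer and the output."""
    return SequenceMatcher(None, expected, produced).ratio()


if __name__ == "__main__":
    fillers = [{"user": f"question {i}", "assistant": f"reply {i}"} for i in range(20)]
    needles = ["poem one", "poem two", "poem three", "poem four"]
    convo = build_haystack("Write a poem about tapirs", needles, fillers)
    # A real run would ask the model to reproduce, say, the 3rd poem
    # and compare its output against needles[2].
    print(len(convo), score(needles[2], "poem three"))
```

Because every needle request is textually identical, the model cannot locate the right answer by keyword match alone; it has to track which occurrence is being asked for, which is what makes the multi-needle variant harder than a single-needle lookup.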

For workloads that exceed even that context, GPT-5.2 Thinking integrates with the Responses /compact endpoint, which performs context compaction to extend the effective window for tool-heavy, long running jobs. This is relevant if you are building agents that iteratively call tools over many steps and need to maintain state beyond the raw token limit.
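The general idea of context compaction can be sketched locally. The example below does not call the Responses /compact endpoint, and it makes no claim about that endpoint's request or response shape; `AgentContext`, `naive_summary`, and the 4-characters-per-token estimate are hypothetical stand-ins that only illustrate folding older turns into a summary once a budget is exceeded.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def naive_summary(old_turns: List[str]) -> str:
    # Placeholder summarizer: keeps only the first 20 characters of each turn.
    # A real agent would ask a model (or a compaction service) to summarize.
    return "[compacted] " + " | ".join(t[:20] for t in old_turns)


@dataclass
class AgentContext:
    budget_tokens: int
    keep_recent: int = 4
    turns: List[str] = field(default_factory=list)

    def total_tokens(self) -> int:
        return sum(estimate_tokens(t) for t in self.turns)

    def add(self, turn: str,
            summarize: Callable[[List[str]], str] = naive_summary) -> None:
        """Append a turn; if over budget, fold all but the most recent
        `keep_recent` turns into a single summary turn."""
        self.turns.append(turn)
        if (self.total_tokens() > self.budget_tokens
                and len(self.turns) > self.keep_recent):
            old = self.turns[:-self.keep_recent]
            recent = self.turns[-self.keep_recent:]
            self.turns = [summarize(old)] + recent


if __name__ == "__main__":
    ctx = AgentContext(budget_tokens=50)
    for i in range(10):
        ctx.add(f"step {i}: " + "x" * 72)
    print(len(ctx.turns), ctx.turns[0][:40])
```

Note that this sketch shrinks the history but does not guarantee the result fits the budget, since the most recent turns are always kept verbatim; a production agent would need a policy for that case as well.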

On tool usage, GPT-5.2 Thinking reaches 98.7 % on Tau2-bench Telecom, a multi-turn customer support benchmark where the model must orchestrate tool calls across a realistic workflow. The official examples from OpenAI's release post show scenarios like a traveler with a delayed flight, missed connection, lost bag and medical seating requirement, where GPT-5.2 manages rebooking, special assistance seating and compensation in a consistent sequence while GPT-5.1 leaves steps unfinished.

Vision, science and math

Vision quality also moves up. GPT-5.2 Thinking roughly halves error rates on chart reasoning and user interface understanding benchmarks like CharXiv Reasoning and ScreenSpot Pro when a Python tool is enabled. The model also shows improved spatial understanding of images: for example, when labeling motherboard components with approximate bounding boxes, GPT-5.2 identifies more regions with tighter placement than GPT-5.1.

For scientific workloads, GPT-5.2 Pro scores 93.2 % and GPT-5.2 Thinking 92.4 % on GPQA Diamond, and GPT-5.2 Thinking solves 40.3 % of FrontierMath Tier 1 to Tier 3 problems with Python tools enabled. These benchmarks cover graduate-level physics, chemistry, biology and expert mathematics, and OpenAI highlights early use where GPT-5.2 Pro contributed to a proof in statistical learning theory under human verification.

Comparison Table

| Model | Main positioning | Context window / max output | Knowledge cutoff | Notable benchmarks (Thinking / Pro vs GPT-5.1 Thinking) |
| --- | --- | --- | --- | --- |
| GPT-5.1 | Flagship model for coding and agentic tasks with configurable reasoning effort | 400,000 tokens context, 128,000 max output | 2024-09-30 | SWE-Bench Pro 50.8 %, SWE-bench Verified 76.3 %, ARC-AGI-1 72.8 %, ARC-AGI-2 17.6 % |
| GPT-5.2 (Thinking) | New flagship model for coding and agentic tasks across industries and for long running agents | 400,000 tokens context, 128,000 max output | 2025-08-31 | GDPval wins or ties 70.9 % vs industry professionals, SWE-Bench Pro 55.6 %, SWE-bench Verified 80.0 %, ARC-AGI-1 86.2 %, ARC-AGI-2 52.9 % |
| GPT-5.2 Pro | Higher compute version of GPT-5.2 for the hardest reasoning and scientific workloads, produces smarter and more precise responses | 400,000 tokens context, 128,000 max output | 2025-08-31 | GPQA Diamond 93.2 % vs 92.4 % for GPT-5.2 Thinking and 88.1 % for GPT-5.1 Thinking, ARC-AGI-1 90.5 % and ARC-AGI-2 54.2 % |

Key Takeaways

  1. GPT-5.2 Thinking is the new default workhorse model: It replaces GPT-5.1 Thinking as the main model for coding, knowledge work and agents, while keeping the same 400k context and 128k max output, but with clearly higher benchmark performance across GDPval, SWE-Bench, ARC-AGI and scientific QA.
  2. Substantial accuracy jump over GPT-5.1 at similar scale: On key benchmarks, GPT-5.2 Thinking moves from 50.8 % to 55.6 % on SWE-Bench Pro and from 76.3 % to 80.0 % on SWE-bench Verified, and from 72.8 % to 86.2 % on ARC-AGI-1 and from 17.6 % to 52.9 % on ARC-AGI-2, while keeping token limits comparable.
  3. GPT-5.2 Pro is targeted at high-end reasoning and science: GPT-5.2 Pro is the higher compute variant that primarily improves hard reasoning and scientific tasks, for example reaching 93.2 % on GPQA Diamond versus 92.4 % for GPT-5.2 Thinking and 88.1 % for GPT-5.1 Thinking, and higher scores on ARC-AGI tiers.
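The size of these jumps is easier to compare when computed side by side. The sketch below uses only the GPT-5.1 Thinking and GPT-5.2 Thinking scores quoted in this article; the "share of remaining errors removed" column is an added framing for illustration, not a metric reported by OpenAI.

```python
# (GPT-5.1 Thinking, GPT-5.2 Thinking) scores in percent, as quoted in the article.
SCORES = {
    "SWE-Bench Pro": (50.8, 55.6),
    "SWE-bench Verified": (76.3, 80.0),
    "ARC-AGI-1": (72.8, 86.2),
    "ARC-AGI-2": (17.6, 52.9),
    "GPQA Diamond": (88.1, 92.4),
}


def gains(scores):
    """Return {benchmark: (absolute gain in points, % of remaining errors removed)}."""
    result = {}
    for name, (old, new) in scores.items():
        absolute = round(new - old, 1)
        # Fraction of the gap to 100 % that the newer model closes.
        error_reduction = round(100 * (new - old) / (100 - old), 1)
        result[name] = (absolute, error_reduction)
    return result


if __name__ == "__main__":
    for name, (absolute, reduction) in gains(SCORES).items():
        print(f"{name:20s} +{absolute:4.1f} pts, {reduction:4.1f} % of errors removed")
```

Viewed this way, the ARC-AGI results stand out: the absolute gains there are several times larger than on the two SWE-Bench variants.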


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.
