If a tool that writes code can also do science, then OpenAI has a path to dominating far more than the software industry. That is the core wager OpenAI's chief scientist, Jakub Pachocki, laid out in an exclusive interview with MIT Technology Review - and the logic is disarmingly simple.

Pachocki said daily workflows at OpenAI are unrecognizable compared with a year ago. Most technical staff now use Codex, the company's agent-based tool that can write and execute code, analyze documents, and generate charts on the fly. His game plan, as "Hvylya" reports, citing the MIT Technology Review interview, is to supercharge those capabilities and apply them across math, physics, biology, chemistry, and policy.

GPT-5 - the model powering Codex - has already been used to crack previously unsolved math problems. Pachocki said the results have exceeded what most doctoral researchers could achieve in weeks, and he expects the pace to accelerate further as models improve.

He acknowledged that OpenAI could build "an amazing automated mathematician" with current tools but has chosen not to. "That lets you prove that the technology works before you connect it to the real world," he said of such a project. "We are much more focused now on research that's relevant in the real world." The real-world impact of agent-based AI has already been quantified elsewhere: a Georgetown study found that AI agents embedded in the Pentagon's Maven system let 20 people do the work of 2,000.

Doug Downey, a research scientist at the Allen Institute for AI, agreed the concept is compelling. "It would be exciting if we could come back tomorrow morning and the agent's done a bunch of work and there's new results we can examine," he said. But he warned that scaling from coding to broader science is harder than it looks. When his team tested top-tier LLMs on a range of scientific tasks last summer, even the best model still made frequent mistakes.

The competition is intense. Anthropic has disclosed that its AI already writes the vast majority of its own training code, and Google DeepMind has been pursuing automated scientific discovery since its founding. OpenAI is betting that reasoning models - which train LLMs to work through problems step by step, backtracking on mistakes - will close the remaining gap. Pachocki expressed confidence that these models will continue to improve.

Also read: "A Gradient, Not a Wall": Why Anthropic Rejected the Pentagon Compromise OpenAI Later Accepted.