Anthropic Warns AI Self-Improvement Is Near

Editor J

Anthropic's internal data shows AI is accelerating its own development. Warning that recursive self-improvement may be near, the company urges a global mechanism for pausing frontier AI work.

The milestone where artificial intelligence designs superior models without human intervention is closer than previously assumed. That is the warning issued by Anthropic in a blog post published on June 4. For the first time, the company shared internal data tracking the self-improvement rates of its own models, arguing that the trajectory points toward recursive self-improvement.

Anthropic's core proposal goes a step further: humanity must secure an international mechanism to slow or temporarily halt frontier AI development. The recommendation has immediately revived a long-standing suspicion about a company that markets itself on AI safety: that its calls for regulation are primarily a tactic to hinder competitors.

The Internal Data Behind Anthropic's Warnings

Anthropic bases its warnings on empirical internal metrics rather than abstract concerns. As of May 2026, Claude authored more than 80% of the code merged into the company's codebase. Before Claude Code was released in research preview in early 2025, that proportion was in the low single digits. The company now routes most new code through an automated Claude code reviewer; a retrospective analysis found that this reviewer would have caught about a third of the bugs behind past production incidents.

The resulting productivity curve is steep. In the second quarter of 2026, a typical engineer merged eight times as much code daily as in 2024. The duration of tasks a model can reliably complete is doubling approximately every four months. In March 2024, Claude Opus 3 handled 4-minute jobs; a year later, Sonnet 3.7 managed 90-minute tasks; and by March 2026, Opus 4.6 cleared 12-hour tasks.
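The four-month figure can be checked against the three data points quoted above. The following is a minimal sketch, assuming steady exponential growth between the quoted dates; the function and variable names are illustrative and do not come from Anthropic's post.

```python
import math

# Task durations quoted in the article, in minutes,
# keyed by months elapsed since March 2024.
observations = [
    (0, 4),         # March 2024: Claude Opus 3, 4-minute jobs
    (12, 90),       # March 2025: Sonnet 3.7, 90-minute tasks
    (24, 12 * 60),  # March 2026: Opus 4.6, 12-hour tasks
]


def doubling_time_months(start, end):
    """Months per doubling implied by two (month, minutes) points.

    Assumes growth is exponential between the two points.
    """
    (t0, d0), (t1, d1) = start, end
    doublings = math.log2(d1 / d0)
    return (t1 - t0) / doublings


# Doubling time for each consecutive pair of observations.
for a, b in zip(observations, observations[1:]):
    print(f"months {a[0]}-{b[0]}: {doubling_time_months(a, b):.1f} months per doubling")

# Doubling time across the whole two-year span.
overall = doubling_time_months(observations[0], observations[-1])
print(f"overall: {overall:.1f} months per doubling")
```

On these figures, the most recent year (90 minutes to 12 hours) works out to one doubling every four months, while the first year implies a faster pace of roughly 2.7 months per doubling.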

The research numbers stand out most. In an experiment to speed up model-training code, the speedup Claude achieved rose from approximately 3-fold in May 2025 to 52-fold in April 2026. A skilled human researcher typically needs four to eight hours on the same task to achieve a 4-fold improvement.

Three Scenarios for a Self-Improving Future

[Figure: The key visual from Anthropic's recursive self-improvement report]

These metrics ultimately point toward recursive self-improvement: a state where an artificial intelligence designs and trains a more capable successor without human intervention. While Anthropic emphasizes that this milestone has 'not yet been reached and is not inevitable,' it warns that the transition 'could occur sooner than most institutions are prepared for.'

The company outlines three potential trajectories. In the first scenario, the exponential growth curve flattens into an S-curve, leading to stagnant progress. In the second, humans define high-level objectives while AI automates execution, allowing productivity to compound exponentially—enabling a 100-person firm to match the output of 10,000 workers.

The third and most concerning scenario involves full recursive self-improvement. Once AI begins independently building its own successors, the pace of improvement is bounded only by available compute. If alignment challenges remain unresolved and minor errors compound, humanity risks losing control. For a company that has built its identity on AI safety, this is the outcome it fears most, and the reason it argues safety research must keep pace with capability.

Verifying an AI Pause: More Challenging than Nuclear Disarmament

This risk has prompted Anthropic to propose an international mechanism for slowing AI development. Co-authors Jack Clark, head of policy, and Marina Favaro, lead of internal research, argue that humanity must possess the capability to slow or pause frontier research, giving regulatory frameworks and safety science time to keep pace. Clark estimates that some models could achieve recursive self-improvement within two years.

However, implementation remains highly problematic. If a single developer halts progress unilaterally, less cautious competitors will simply capture the lead. Consequently, major laboratories worldwide would have to pause simultaneously under uniform terms, with compliance independently verified. Anthropic references Cold War agreements like the INF Treaty as precedents, yet concedes that 'AI training runs are far easier to hide than missile silos,' which makes verification far more complex than it is in nuclear arms control.

Skeptics remain highly critical of the proposal. David Sacks, an adviser to the Trump administration, has accused Anthropic of attempting 'regulatory capture' to disadvantage open-source models, while other analysts view the warnings as a marketing campaign aimed at investors. As noted by the Wall Street Journal, this warning arrives as Anthropic prepares for a potential IPO at a valuation approaching $1 trillion.
