Anthropic says AI labs need coordinated plan to halt development if risks rise

News Team

13 hours ago

Anthropic said on Thursday frontier AI developers should establish a coordinated, verifiable way to slow down or temporarily pause development if advanced systems begin improving themselves faster than society can manage the risks.

AI that can build itself would be a major development in the history of technology, but “full recursive self-improvement also might increase the risks of humans losing control over AI systems,” the AI startup said.

“If systems are capable of fully building their own successors, the ways we secure them, monitor them, and shape their behavior all grow much more important.”

As an example, Anthropic said that as of May, more than 80% of the code merged into its codebase was authored by Claude.

It would be “good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology,” the company said.

However, it cautioned that unilateral or poorly coordinated slowdowns could backfire if less cautious actors continue advancing, potentially reducing overall safety.

It said a meaningful pause would require agreement among “multiple well-resourced labs” operating at the technological frontier, as well as rules on what conditions would trigger or lift such a pause and who would oversee it.

A unilateral pause by a single company would be easier to implement, Anthropic added, but would have limited impact, primarily shifting leadership rather than fostering broader global deliberation.

Its research arm, Anthropic Institute, plans to study and help build systems that would be necessary to support a slowdown.

In the coming months, Anthropic plans to convene discussions involving policymakers, researchers, civil society groups and other AI firms to examine key questions.

These questions include how to manage AI-related risks such as recursive self-improvement and how to improve mechanisms for coordination.

Last month, Anthropic concluded a fundraising round that valued the company at $965 billion. On Monday, it confidentially filed for a U.S. initial public offering.