In February 2025, five members of Anthropic's frontier red team were attending a conference in Santa Clara, California, when they received disturbing news: results of a controlled trial indicated that a soon-to-be-released version of Claude could help terrorists make biological weapons.
As "Hvylya" details, citing a TIME investigation into Anthropic, the team sprinted back to their hotel room, flipped a bed on its side to serve as a makeshift desk, and spent hours poring over the test results. After extensive analysis, they still were not sure whether the new product was safe. Anthropic held up the release of Claude 3.7 Sonnet for 10 days - an eternity for a company operating at the frontier of an industry rapidly remaking the world.
Logan Graham, the 31-year-old leader of the red team, recalled the incident as an example of the challenges Anthropic faces daily. The team studies Claude's advanced capabilities and tries to project worst-case scenarios, from cyberattacks to biosecurity threats. Graham's recollection of the scare was characteristically understated: "It was a fun and interesting day."
The episode illustrated a tension at the core of Anthropic's identity. The company is the frontier AI lab with the greatest emphasis on safety - and simultaneously one of the leaders in the race to create ever more powerful versions of a technology that many of its own staff believe could lead to catastrophic outcomes. Dave Orr, Anthropic's head of safeguards, captured the dilemma with a vivid analogy. "We're driving down a cliff road. A mistake will kill you," he said. "Now we're driving at 75 instead of 25."
Graham, for his part, does not sugarcoat the stakes. "Some people's intuition from growing up in a peaceful world is that somewhere there's a room full of adults who know how to fix it," he said. "There are no groups of adults. There is no room in the first place." Anthropic's leadership has said it expects the years from 2026 to 2030 to be the most critical period, with models becoming more capable at a pace that may outstrip humans' ability to manage them.
