<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Anthropic Frontier Red Team Blog</title>
    <link>https://anthropic.com/feed_anthropic_red.xml</link>
    <description>Evidence-based analysis about AI's implications for cybersecurity, biosecurity, and autonomous systems</description>
    <docs>http://www.rssboard.org/rss-specification</docs>
    <generator>python-feedgen</generator>
    <image>
      <url>https://www.anthropic.com/images/icons/apple-touch-icon.png</url>
      <title>Anthropic Frontier Red Team Blog</title>
      <link>https://anthropic.com/feed_anthropic_red.xml</link>
    </image>
    <language>en</language>
    <lastBuildDate>Wed, 18 Feb 2026 23:08:07 +0000</lastBuildDate>
    <item>
      <title>LLM-discovered 0-days</title>
      <link>https://red.anthropic.com/2026/zero-days/</link>
      <description>AI models can now find high-severity vulnerabilities at scale. This is a moment to empower defenders. We're now using Claude to find and help fix vulnerabilities in open-source software.</description>
      <guid>https://red.anthropic.com/2026/zero-days/</guid>
      <pubDate>Thu, 05 Feb 2026 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>AI Models on Realistic Cyber Ranges</title>
      <link>https://red.anthropic.com/2026/cyber-toolkits-update/</link>
      <description>In a recent evaluation of AI models' cyber capabilities, we found that current Claude models can succeed at multistage attacks on networks with dozens of hosts using only standard, open-source tools, rather than the custom tooling previous generations required.</description>
      <guid>https://red.anthropic.com/2026/cyber-toolkits-update/</guid>
      <pubDate>Fri, 16 Jan 2026 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Finding Bugs with Claude and Property-based Testing</title>
      <link>https://red.anthropic.com/2026/property-based-testing/</link>
      <description>Ensuring that programs are bug-free is one of the most challenging aspects of software engineering. We developed an agent that can efficiently identify bugs in large software projects: it infers general properties that the code should satisfy, then applies property-based testing. After extensive manual validation, we are reporting bugs in top Python packages to their developers, several of which have already been patched.</description>
      <guid>https://red.anthropic.com/2026/property-based-testing/</guid>
      <pubDate>Wed, 14 Jan 2026 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Experimenting with AI to Defend Critical Infrastructure</title>
      <link>https://red.anthropic.com/2026/critical-infrastructure-defense/</link>
      <description>AI could help defenders of critical infrastructure identify the vulnerabilities that attackers might exploit, and close them before they are exploited. Anthropic has partnered with Pacific Northwest National Laboratory (PNNL) to explore this defensive application of AI, demonstrating both the potential of AI-accelerated defense and the value of public-private partnerships in harnessing AI for national security.</description>
      <guid>https://red.anthropic.com/2026/critical-infrastructure-defense/</guid>
      <pubDate>Thu, 08 Jan 2026 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Project Vend: Phase Two</title>
      <link>https://red.anthropic.com/2025/project-vend-2/</link>
      <description>In June, we revealed that we'd set up a small shop in our San Francisco office run by an AI shopkeeper. It did not do particularly well. We made some adjustments for phase two of Project Vend. The idea of an AI running a business doesn't seem as far-fetched as it once did, but the gap between 'capable' and 'completely robust' remains wide.</description>
      <guid>https://red.anthropic.com/2025/project-vend-2/</guid>
      <pubDate>Thu, 18 Dec 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>AI Agents Find Smart Contract Exploits</title>
      <link>https://red.anthropic.com/2025/smart-contracts/</link>
      <description>We evaluated AI agents' ability to exploit smart contracts using a new benchmark comprising contracts that were actually exploited. On contracts exploited after the latest knowledge cutoffs, Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 found vulnerabilities worth a combined $4.6 million, a finding that underscores the need for proactive adoption of AI for defense.</description>
      <guid>https://red.anthropic.com/2025/smart-contracts/</guid>
      <pubDate>Mon, 01 Dec 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Project Fetch</title>
      <link>https://red.anthropic.com/2025/project-fetch/</link>
      <description>How could frontier AI models like Claude reach beyond computers and affect the physical world? One path is through robots. We ran an experiment to see how much Claude helped Anthropic staff perform complex tasks with a robot dog.</description>
      <guid>https://red.anthropic.com/2025/project-fetch/</guid>
      <pubDate>Wed, 12 Nov 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Building AI for Cyber Defenders</title>
      <link>https://red.anthropic.com/2025/ai-for-cyber-defenders/</link>
      <description>We invested in improving Claude's ability to help defenders detect, analyze, and remediate vulnerabilities in code and deployed systems. This work allowed Claude Sonnet 4.5 to match or eclipse Opus 4.1 in discovering code vulnerabilities and other cyber skills. Adopting and experimenting with AI will be key for defenders to keep pace.</description>
      <guid>https://red.anthropic.com/2025/ai-for-cyber-defenders/</guid>
      <pubDate>Mon, 29 Sep 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>LLMs and Biorisk</title>
      <link>https://red.anthropic.com/2025/biorisk/</link>
      <description>Our work at Anthropic is animated by the potential for AI to advance scientific discovery, especially in biology and medicine. At the same time, AI is fundamentally a dual-use technology. This article explains why we believe that evaluating biorisk and safeguarding against it is a critical element of responsible AI development.</description>
      <guid>https://red.anthropic.com/2025/biorisk/</guid>
      <pubDate>Fri, 05 Sep 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Developing Nuclear Safeguards for AI</title>
      <link>https://red.anthropic.com/2025/nuclear-safeguards/</link>
      <description>Together with the NNSA and DOE national laboratories, we have co-developed a classifier, an AI system that automatically categorizes content, that distinguishes between concerning and benign nuclear-related conversations with high accuracy in preliminary testing.</description>
      <guid>https://red.anthropic.com/2025/nuclear-safeguards/</guid>
      <pubDate>Thu, 21 Aug 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Claude Does Cyber Competitions</title>
      <link>https://red.anthropic.com/2025/cyber-competitions/</link>
      <description>Throughout 2025, we have been quietly entering Claude in cybersecurity competitions designed primarily for humans. In many of these competitions Claude did quite well, often placing in the top 25% of competitors, but it lagged behind the best human teams on the toughest challenges.</description>
      <guid>https://red.anthropic.com/2025/cyber-competitions/</guid>
      <pubDate>Sat, 09 Aug 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Cyber Evaluations of Claude 4</title>
      <link>https://red.anthropic.com/2025/claude-4-cyber/</link>
      <description>We partnered with Pattern Labs on a range of cybersecurity evaluations of Claude Opus 4 and Claude Sonnet 4, with Opus demonstrating especially notable improvement over previous models.</description>
      <guid>https://red.anthropic.com/2025/claude-4-cyber/</guid>
      <pubDate>Tue, 15 Jul 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Project Vend</title>
      <link>https://red.anthropic.com/2025/project-vend/index.html</link>
      <description>We let Claude manage an automated store in our office as a small business for about a month. We learned a lot about the plausible, strange, not-too-distant future in which AI models are autonomously running things in the real economy.</description>
      <guid>https://red.anthropic.com/2025/project-vend/index.html</guid>
      <pubDate>Fri, 27 Jun 2025 00:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Cyber Toolkits for LLMs</title>
      <link>https://red.anthropic.com/2025/cyber-toolkits/index.html</link>
      <description>Large Language Models (LLMs) that are not fine-tuned for cybersecurity can succeed in multistage attacks on networks with dozens of hosts when equipped with a novel toolkit.</description>
      <guid>https://red.anthropic.com/2025/cyber-toolkits/index.html</guid>
      <pubDate>Fri, 13 Jun 2025 00:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>