Anthropic Engineering Blog
https://anthropic.com/engineering/feed_anthropic_engineering.xml
Inside the team building reliable AI systems
Language: en
Last updated: Tue, 13 Jan 2026 19:06:11 +0000

Demystifying evals for AI agents
https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents
The capabilities that make agents useful also make them difficult to evaluate. The strategies that work across deployments combine techniques to match the complexity of the systems they measure.
Category: Engineering
Published: Fri, 09 Jan 2026 00:00:00 +0000

Effective harnesses for long-running agents
https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents
Agents still face challenges working across many context windows. We looked to human engineers for inspiration in creating a more effective harness for long-running agents.
Category: Engineering
Published: Wed, 26 Nov 2025 00:00:00 +0000

Introducing advanced tool use on the Claude Developer Platform
https://www.anthropic.com/engineering/advanced-tool-use
We’ve added three new beta features that let Claude discover, learn, and execute tools dynamically. Here’s how they work.
Category: Engineering
Published: Mon, 24 Nov 2025 00:00:00 +0000

Code execution with MCP: Building more efficient agents
https://www.anthropic.com/engineering/code-execution-with-mcp
Direct tool calls consume context for each definition and result. Agents scale better by writing code to call tools instead. Here's how it works with MCP.
Category: Engineering
Published: Tue, 04 Nov 2025 00:00:00 +0000

Beyond permission prompts: making Claude Code more secure and autonomous
https://www.anthropic.com/engineering/claude-code-sandboxing
Claude Code's new sandboxing features, a bash tool and Claude Code on the web, reduce permission prompts and increase user safety by enabling two boundaries: filesystem and network isolation.
Category: Engineering
Published: Mon, 20 Oct 2025 00:00:00 +0000

Equipping agents for the real world with Agent Skills
https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills
Claude is powerful, but real work requires procedural knowledge and organizational context. Introducing Agent Skills, a new way to build specialized agents using files and folders.
Category: Engineering
Published: Thu, 16 Oct 2025 00:00:00 +0000

Effective context engineering for AI agents
https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
Context is a critical but finite resource for AI agents. In this post, we explore strategies for effectively curating and managing the context that powers them.
Category: Engineering
Published: Mon, 29 Sep 2025 00:00:00 +0000

Building agents with the Claude Agent SDK
https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk
The Claude Agent SDK is a collection of tools that helps developers build powerful agents on top of Claude Code. In this article, we walk through how to get started and share our best practices.
Category: Engineering
Published: Mon, 29 Sep 2025 00:00:00 +0000

A postmortem of three recent issues
https://www.anthropic.com/engineering/a-postmortem-of-three-recent-issues
This is a technical report on three bugs that intermittently degraded responses from Claude. Below we explain what happened, why it took time to fix, and what we're changing.
Category: Engineering
Published: Wed, 17 Sep 2025 00:00:00 +0000

Writing effective tools for agents — with agents
https://www.anthropic.com/engineering/writing-tools-for-agents
Agents are only as effective as the tools we give them. We share how to write high-quality tools and evaluations, and how you can boost performance by using Claude to optimize its tools for itself.
Category: Engineering
Published: Thu, 11 Sep 2025 00:00:00 +0000

Desktop Extensions: One-click MCP server installation for Claude Desktop
https://www.anthropic.com/engineering/desktop-extensions
Desktop Extensions make installing MCP servers as easy as clicking a button. We share the technical architecture and tips for creating good extensions.
Category: Engineering
Published: Thu, 26 Jun 2025 00:00:00 +0000

How we built our multi-agent research system
https://www.anthropic.com/engineering/multi-agent-research-system
Our Research feature uses multiple Claude agents to explore complex topics more effectively. We share the engineering challenges and the lessons we learned from building this system.
Category: Engineering
Published: Fri, 13 Jun 2025 00:00:00 +0000

Claude Code: Best practices for agentic coding
https://www.anthropic.com/engineering/claude-code-best-practices
Claude Code is a command line tool for agentic coding. This post covers tips and tricks that have proven effective for using Claude Code across various codebases, languages, and environments.
Category: Engineering
Published: Fri, 18 Apr 2025 00:00:00 +0000

The "think" tool: Enabling Claude to stop and think in complex tool use situations
https://www.anthropic.com/engineering/claude-think-tool
A new tool that improves Claude's complex problem-solving performance
Category: Engineering
Published: Thu, 20 Mar 2025 00:00:00 +0000

Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet
https://www.anthropic.com/engineering/swe-bench-sonnet
SWE-bench is an AI evaluation benchmark that assesses a model's ability to complete real-world software engineering tasks.
Category: Engineering
Published: Mon, 06 Jan 2025 00:00:00 +0000

Building effective agents
https://www.anthropic.com/engineering/building-effective-agents
We've worked with dozens of teams building LLM agents across industries. Consistently, the most successful implementations use simple, composable patterns rather than complex frameworks.
Category: Engineering
Published: Thu, 19 Dec 2024 00:00:00 +0000

Introducing Contextual Retrieval
https://www.anthropic.com/engineering/contextual-retrieval
For an AI model to be useful in specific contexts, it often needs access to background knowledge.
Category: Engineering
Published: Thu, 19 Sep 2024 00:00:00 +0000