--- name: mcp-advanced-patterns description: Advanced MCP patterns for tool composition, resource management, and scaling. Build custom MCP servers, compose tools, manage resources efficiently. Use when composing MCP tools or scaling MCP servers. version: 1.0.0 author: OrchestKit context: fork agent: llm-integrator tags: [mcp, tools, resources, scaling, servers, composition, 2026] user-invocable: false --- # MCP Advanced Patterns Advanced Model Context Protocol patterns for production-grade MCP implementations. > **FastMCP 2.14.x** (Jan 2026): Enterprise auth, OpenAPI/FastAPI generation, server composition, proxying. Python 3.10-3.13. ## Overview - Composing multiple tools into orchestrated workflows - Managing resource lifecycle and caching efficiently - Scaling MCP servers horizontally with load balancing - Building custom MCP servers with middleware and transports - Implementing auto-enable thresholds for context management ## Tool Composition Pattern ```python from dataclasses import dataclass from typing import Any, Callable, Awaitable @dataclass class ComposedTool: """Combine multiple tools into a single pipeline operation.""" name: str tools: dict[str, Callable[..., Awaitable[Any]]] pipeline: list[str] async def execute(self, input_data: dict[str, Any]) -> dict[str, Any]: """Execute tool pipeline sequentially.""" result = input_data for tool_name in self.pipeline: tool = self.tools[tool_name] result = await tool(result) return result # Example: Search + Summarize composition search_summarize = ComposedTool( name="search_and_summarize", tools={ "search": search_documents, "summarize": summarize_content, }, pipeline=["search", "summarize"] ) ``` ## FastMCP Server with Lifecycle ```python from contextlib import asynccontextmanager from collections.abc import AsyncIterator from dataclasses import dataclass from mcp.server.fastmcp import Context, FastMCP @dataclass class AppContext: """Typed application context with shared resources.""" db: Database cache: CacheService config: dict @asynccontextmanager async def app_lifespan(server: FastMCP) -> AsyncIterator[AppContext]: """Manage server startup and shutdown lifecycle.""" # Initialize on startup db = await Database.connect() cache = await CacheService.connect() try: yield AppContext(db=db, cache=cache, config={"timeout": 30}) finally: # Cleanup on shutdown await cache.disconnect() await db.disconnect() mcp = FastMCP("Production Server", lifespan=app_lifespan) @mcp.tool() def query_data(sql: str, ctx: Context) -> str: """Execute query using shared connection.""" app_ctx = ctx.request_context.lifespan_context return app_ctx.db.query(sql) ``` ## Auto-Enable Thresholds (CC 2.1.9) Configure MCP servers to auto-enable/disable based on context window usage: ```yaml # .claude/settings.json mcp: context7: enabled: auto:75 # High-value docs, keep available longer sequential-thinking: enabled: auto:60 # Complex reasoning needs room memory: enabled: auto:90 # Knowledge graph - preserve until compaction playwright: enabled: auto:50 # Browser-heavy, disable early ``` **Threshold Guidelines:** | Threshold | Use Case | Rationale | |-----------|----------|-----------| | auto:90 | Critical persistence | Keep until context nearly full | | auto:75 | High-value reference | Preserve for complex tasks | | auto:60 | Reasoning tools | Need headroom for output | | auto:50 | Resource-intensive | Disable early to free context | ## Resource Management ```python from functools import lru_cache from datetime import datetime, timedelta from typing import Any class MCPResourceManager: """Manage MCP resources with caching and lifecycle.""" def __init__(self, cache_ttl: timedelta = timedelta(minutes=15)): self.resources: dict[str, Any] = {} self.cache_ttl = cache_ttl self.last_access: dict[str, datetime] = {} def get_resource(self, uri: str) -> Any: """Get resource with access time tracking.""" if uri in self.resources: self.last_access[uri] = datetime.now() return self.resources[uri] resource = self._load_resource(uri) self.resources[uri] = resource self.last_access[uri] = datetime.now() return resource def cleanup_stale(self) -> int: """Remove stale resources. Returns count of removed.""" now = datetime.now() stale = [ uri for uri, last in self.last_access.items() if now - last > self.cache_ttl ] for uri in stale: del self.resources[uri] del self.last_access[uri] return len(stale) ``` ## Horizontal Scaling ```python import asyncio from typing import List class MCPLoadBalancer: """Load balance across multiple MCP server instances.""" def __init__(self, servers: List[str]): self.servers = servers self.current = 0 self.health: dict[str, bool] = {s: True for s in servers} async def get_healthy_server(self) -> str: """Round-robin with health check.""" for _ in range(len(self.servers)): server = self.servers[self.current] self.current = (self.current + 1) % len(self.servers) if self.health[server]: return server raise RuntimeError("No healthy servers available") async def health_check_loop(self): """Periodic health check for all servers.""" while True: for server in self.servers: try: self.health[server] = await self._ping(server) except Exception: self.health[server] = False await asyncio.sleep(30) ``` ## Key Decisions | Decision | Recommendation | |----------|----------------| | Transport | Streamable HTTP for web, stdio for CLI | | Lifecycle | Always use lifespan for resource management | | Composition | Chain tools via pipeline pattern | | Scaling | Health-checked round-robin for redundancy | | Auto-enable | Use auto:N thresholds per server criticality | ## Common Mistakes - No lifecycle management (resource leaks) - Missing health checks in load balancing - Hardcoded server endpoints - No graceful degradation on server failure - Ignoring context window thresholds ## Related Skills - `function-calling` - LLM tool integration patterns - `resilience-patterns` - Circuit breakers and retries - `connection-pooling` - Database connection management - `streaming-api-patterns` - Real-time streaming ## Capability Details ### tool-composition **Keywords:** tool composition, pipeline, orchestration, chain tools **Solves:** - Combine multiple tools into workflows - Sequential tool execution - Tool result passing ### resource-management **Keywords:** resource, cache, lifecycle, cleanup, ttl **Solves:** - Manage resource lifecycle - Implement resource caching - Clean up stale resources ### scaling-strategies **Keywords:** scale, load balance, horizontal, health check, redundancy **Solves:** - Scale MCP servers horizontally - Implement health-checked load balancing - Handle server failures gracefully ### server-building **Keywords:** server, fastmcp, lifespan, middleware, transport **Solves:** - Build production MCP servers - Manage server lifecycle - Configure transports and middleware ### auto-enable-thresholds **Keywords:** auto-enable, context window, threshold, auto:N **Solves:** - Configure MCP auto-enable/disable - Manage context window usage - Optimize MCP server availability