# lauren-ai — Full API Reference

> First-party AI/LLM companion for the Lauren web framework.

Version: see `lauren_ai.__version__`

Requires: Python ≥ 3.11, lauren ≥ 0.1.0

---

## Installation

```bash
pip install "lauren-ai[anthropic]"   # Anthropic Claude (default)
pip install "lauren-ai[openai]"      # OpenAI / Ollama
pip install "lauren-ai[all]"         # All providers + all extras
```

Extras available: `anthropic`, `openai`, `ollama`, `knowledge`, `eval`, `dev`.

---

## Package layout

```
lauren_ai/
├── __init__.py          # Public API surface
├── _agents/             # @agent(), @use_tools(), AgentRunner, AgentContext, AgentResponse
├── _config.py           # LLMConfig, AgentConfig
├── _eval/               # AccuracyEval, TrajectoryEval, PerformanceEval
├── _exceptions.py       # Exception hierarchy
├── _extractors.py       # Lauren extractor integration (Agent[T], Embed[T], StreamCompletion[T])
├── _guards.py           # Guard factories (token_budget_guard, requires_capability, safety_guard)
├── _interceptors.py     # ai_metrics_interceptor, token_usage_response_interceptor
├── _chains/             # Chain, Runnable, RunnableLambda, chain()
├── _cost/               # CostTracker, PricingTable, TokenBudget, RateLimiter
├── _guardrails/         # @use_guardrails() (agent), @guardrail() (injectable), TopicFilter, PIIRedactor, LengthFilter, PromptInjectionFilter
├── _knowledge/          # KnowledgeBase, TextLoader, FixedSizeChunker, SentenceChunker
├── _memory/             # ShortTermMemory + UserMemoryStore + @remember()
├── _middleware.py       # Lauren middleware integration (conversation_middleware, ai_rate_limit)
├── _module.py           # LLMModule, AgentModule, LLMService, EmbedService
├── _output_parsers/     # StrOutputParser, JSONOutputParser, PydanticOutputParser, RetryOutputParser
├── _prompts/            # PromptTemplate, ChatPromptTemplate, FewShotPromptTemplate
├── _routing/            # SemanticRouter, Route, RouteMatch
├── _signals.py          # SignalBus + signal dataclasses
├── _skills/             # WebSearchTool, HttpFetchTool, CodeExecutionTool
├── _teams/              # @team(), TeamRunner, TeamResult, TeamMemory
├── _tools/              # @tool(), ToolMeta, ToolRegistry, ToolExecutor, ToolContext, ToolResult
├── _tracing/            # @traced(), Span, Trace, TraceStore, TraceExporter
├── _transport/          # Transport protocol, AnthropicTransport, StructuredLLM, multimodal types
├── _workflows/          # Workflow, Step, Parallel, Condition, Loop
└── testing.py           # AgentTestClient
```

---

## Configuration

### `LLMConfig`

```python
from lauren_ai import LLMConfig

# Direct construction — provider is required
config = LLMConfig(
    provider="anthropic",        # "anthropic" | "openai" | "ollama" | "litellm" (required)
    model="claude-opus-4-6",     # Model ID (required)
    api_key=None,                # API key; falls back to provider env var when None
    base_url=None,               # Override provider base URL (proxies, self-hosted, Ollama)
    max_tokens=4096,             # Max output tokens (default: 4096)
    temperature=1.0,             # Sampling temperature 0.0–2.0 (default: 1.0)
    timeout=60.0,                # Request timeout in seconds (default: 60.0)
    max_retries=3,               # Retries on transient errors (default: 3)
    cache_system_prompt=False,   # Anthropic prompt caching for system prompt
    cache_tools=False,           # Anthropic prompt caching for tool definitions
    embed_model=None,            # Embedding model (defaults to model when None)
    embed_dimensions=None,       # Desired embedding dimensionality
)

# Recommended factory methods
config = LLMConfig.for_anthropic(model="claude-opus-4-6")  # reads ANTHROPIC_API_KEY
config = LLMConfig.for_openai(model="gpt-4o")              # reads OPENAI_API_KEY
config = LLMConfig.for_ollama(model="llama3.2")            # base_url="http://localhost:11434"
cfg, mock = LLMConfig.for_testing()                        # returns (LLMConfig, MockTransport)
```

### `AgentConfig`

```python
from lauren_ai import AgentConfig

config = AgentConfig(
    system_prompt="You are a helpful assistant.",  # Default system prompt (default shown)
    max_turns=10,                       # Max agentic loop iterations (default: 10)
    max_tokens_per_turn=4096,           # Output token cap per turn (default: 4096)
    temperature=1.0,                    # Per-agent temperature override (default: 1.0)
    memory_window_tokens=40_000,        # Sliding history window in tokens (default: 40_000)
    max_cost_usd=None,                  # Hard USD budget cap per run (None = unlimited)
    parallel_tool_calls=False,          # Execute tool calls concurrently (default: False)
    tool_error_policy="return_error",   # "return_error" | "raise" | "skip"

    # Extended thinking (Anthropic only — silently ignored elsewhere)
    thinking=False,
    thinking_budget_tokens=8_000,

    # OpenAI reasoning models
    reasoning_effort=None,              # "low" | "medium" | "high" | None
    include_reasoning_in_response=False,
)
```

---

## Transport

### `Transport` protocol

```python
from lauren_ai._transport import Transport

class MyTransport(Transport):
    async def complete(self, messages, *, model, **kwargs) -> Completion: ...
    async def complete_stream(self, messages, *, model, **kwargs) -> AsyncIterator[CompletionChunk]: ...
    async def embed(self, texts, *, model, **kwargs) -> list[Embedding]: ...
    async def count_tokens(self, messages, *, model) -> int: ...
```

### `AnthropicTransport`

```python
from lauren_ai._transport._anthropic import AnthropicTransport

transport = AnthropicTransport(config)  # Takes a LLMConfig
```

### `MockTransport`

```python
from lauren_ai._transport._mock import MockTransport

mock = MockTransport()
mock.queue_response(completion)                # Queue a Completion object
mock.queue_tool_use("tool_name", {"k": "v"})   # Queue a tool_use response + follow-up completion
mock.queue_error(RuntimeError("oops"))         # Queue an error to be raised
mock.reset()                                   # Clear queue and call history

# After run:
print(mock.calls)       # list[dict] — each complete() call recorded
print(mock.call_count)  # int
```

### Core transport types

```python
from lauren_ai import Message, Completion, TokenUsage, ToolCall, ContentBlock

# Messages
msg = Message(role="user", content="Hello!")
msg = Message.user("Hello!")                    # convenience factory
msg = Message.assistant("Hi!")                  # convenience factory
msg = Message.from_multimodal("user", parts)    # multimodal message; parts is a list of content parts

# ContentBlock factories
block = ContentBlock.text_block("Hello!")
block = ContentBlock.tool_use_block(name="get_weather", tool_input={"city": "Paris"})
block = ContentBlock.tool_result_block(tool_use_id="toolu_abc", content="22°C sunny")

# Completion
result = Completion(
    id="comp_1",
    model="claude-opus-4-6",
    content="Hello!",
    tool_calls=[],
    stop_reason="end_turn",
    usage=TokenUsage(input_tokens=10, output_tokens=5),
    thinking_blocks=[],  # list[ThinkingBlock | RedactedThinkingBlock] — Anthropic only
)

# ThinkingBlock / RedactedThinkingBlock (Anthropic extended thinking)
from lauren_ai._transport import ThinkingBlock, RedactedThinkingBlock

block = ThinkingBlock(thinking="Let me reason through this...", signature="ant-sig-abc")
# block.thinking  — str — the model's reasoning text
# block.signature — str — Anthropic cryptographic signature

redacted = RedactedThinkingBlock(data="")
# redacted.data — str — opaque base64 blob (safety-redacted content)

# TokenUsage
usage = TokenUsage(input_tokens=100, output_tokens=50)
print(usage.total_tokens)                 # 150 (property)
print(usage.cost_usd("claude-opus-4-6"))  # float

# ToolCall
call = ToolCall(id="toolu_abc", name="get_weather", input={"city": "Paris"})
```

---

## Tools

### `@tool()` decorator

```python
from lauren_ai import tool, ToolContext, ToolResult

@tool()
async def get_weather(city: str, ctx: ToolContext | None = None) -> dict:
    """Return current weather for a city.

    Args:
        city: The city name to look up.

    Returns current temperature and conditions.
    """
    return {"city": city, "temperature": 22, "condition": "sunny"}
```

Rules:

- **Must use parentheses**: `@tool()` not `@tool` (raises `DecoratorUsageError`)
- The schema is generated from PEP 3107 annotations and the Google-style `Args:` docstring section
- Any parameter annotated `ToolContext` or `ToolContext | None` is injected at runtime and **excluded from the JSON schema** — the parameter may be **named anything**
- `from __future__ import annotations` is supported, but tool annotations must still resolve when `@tool()` builds the schema.
  Avoid unresolved forward refs or circular imports in function-form tool files.
- Optional params (with defaults) are excluded from the `required` array
- Class-form tools receive `@injectable(scope=SINGLETON)` automatically; override with `@tool() @injectable(scope=Scope.REQUEST)` for request-scoped tools
- Subclasses of `@tool()`-decorated classes must re-apply `@tool()` — the `ToolRegistry` raises `MetadataInheritanceError` for inherited-but-not-redeclared tools

### `@tool()` with class form (DI-injectable)

```python
@tool()
class DelegateToResearcher:
    """Delegate a research task to the ResearchAgent.

    Args:
        task: The research task description.
    """

    def __init__(self, research: ResearchAgent) -> None:
        self._research = research

    async def run(self, task: str) -> dict:
        """Run the delegation."""
        ...
```

Class-form tools:

- Are registered as DI providers — dependencies are injected by the container
- Their `run()` method is the entry point (must be defined)
- May use `from __future__ import annotations`, provided the referenced types resolve when schema generation runs

### `@tool()` with options

```python
@tool(
    name="weather",                      # Override tool name (default: function/class name)
    description="Get current weather",   # Override description (default: from docstring)
    requires_confirmation=True,          # Pause for human-in-the-loop approval before execution
    cache_ttl=300,                       # Cache results for 300 seconds
    pre_hook=my_pre_hook,                # Called before execution
    post_hook=my_post_hook,              # Called after successful execution
    error_hook=my_error_hook,            # Called on exception
)
async def get_weather(city: str) -> dict:
    ...
```

### `ToolResult`

```python
from lauren_ai import ToolResult

# Success
result = ToolResult.ok("The weather is sunny.", tool_use_id="toolu_abc")
result = ToolResult.ok({"temp": 22}, tool_use_id="toolu_abc")  # dict → JSON string

# Error
result = ToolResult.error("City not found.", tool_use_id="toolu_abc")

print(result.content)   # str
print(result.is_error)  # bool
```

### `ToolContext`

```python
from lauren_ai import tool, ToolContext

# Injected by runner into any parameter annotated as ToolContext — name doesn't matter
@tool()
async def my_tool(query: str, ctx: ToolContext | None = None) -> dict:
    """..."""
    # ctx.agent_context     — AgentContext for the running agent
    # ctx.tool_use_id       — Provider-assigned tool call identifier for this invocation
    # ctx.turn              — Agentic loop iteration (0-based) that triggered this call
    # ctx.request           — Originating HTTP Request, or None
    # ctx.execution_context — lauren ExecutionContext (route/handler metadata), or None
    # ctx.state             — Mutable dict[str, Any] for per-call local storage
    # ctx.get_metadata(key, default=None) — Read from agent_context.metadata
    return {"result": query}
```

---

## Agents

### `@agent()` decorator

```python
from lauren_ai import agent, AgentConfig

@agent(
    model="claude-opus-4-6",                # LLM model ID (overrides LLMConfig)
    system="You are a helpful assistant.",  # System prompt for this agent
    config=AgentConfig(max_turns=20),       # Runtime behaviour overrides
    description="Answers questions",        # Human-readable description
)
class AssistantAgent: ...
```

### `@use_tools()` decorator

```python
from lauren_ai import agent, use_tools

@agent(model="claude-opus-4-6")
@use_tools(get_weather, search_web, calculate)
class PlanningAgent: ...
```

**Critical ordering**: `@agent()` is the **outermost** (topmost) decorator, `@use_tools()` is below it. Python applies decorators bottom-up, so `@use_tools()` runs first and sets `USE_TOOLS_META`; `@agent()` then reads it.
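
The bottom-up application order can be checked in plain Python. The sketch below uses two stand-in decorators, `use_tools_sketch` and `agent_sketch`, and a `_use_tools_meta` attribute; all three names are hypothetical and this is not lauren-ai's implementation, only an illustration of why the outer decorator must be the one that reads.

```python
def use_tools_sketch(*tools):
    """Stand-in for @use_tools(): records the tools on the class."""
    def decorate(cls):
        cls._use_tools_meta = tuple(tools)  # hypothetical attribute name
        return cls
    return decorate


def agent_sketch(model):
    """Stand-in for @agent(): reads whatever an inner decorator stored."""
    def decorate(cls):
        cls._agent_meta = {
            "model": model,
            "tools": getattr(cls, "_use_tools_meta", ()),
        }
        return cls
    return decorate


def get_weather(city: str) -> dict:
    return {"city": city}


@agent_sketch(model="claude-opus-4-6")  # applied second, so it sees the tools
@use_tools_sketch(get_weather)          # applied first
class CorrectOrder: ...


@use_tools_sketch(get_weather)          # applied second, too late to be read
@agent_sketch(model="claude-opus-4-6")  # applied first, finds no tools yet
class WrongOrder: ...


print(len(CorrectOrder._agent_meta["tools"]))  # 1
print(len(WrongOrder._agent_meta["tools"]))    # 0
```

In this toy version the wrong order fails silently with an empty tool list; how lauren-ai itself reacts to the wrong order is not stated in this reference.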
### Lifecycle hooks

```python
from lauren_ai import agent, AgentContext, AgentResponse

@agent(model="claude-opus-4-6")
class MyAgent:
    async def on_start(self, ctx: AgentContext) -> None:
        """Called before the first turn."""
        ...

    async def on_finish(self, response: AgentResponse, ctx: AgentContext) -> None:
        """Called after the final turn."""
        ...
```

### `AgentContext`

```python
# Passed to lifecycle hooks; also accessible from tools via ctx.agent_context
# AgentContext fields:
# .agent_id          — Unique identifier for this agent instance (random hex)
# .agent_run_id      — Unique identifier for this specific run (random hex)
# .agent_class       — The @agent()-decorated class
# .config            — Effective AgentConfig for this run
# .memory            — ShortTermMemory for this conversation
# .turn              — Current loop iteration (0-based)
# .metadata          — dict[str, Any] passed via runner.run(metadata=...)
# .request           — Originating HTTP Request, or None
# .execution_context — lauren ExecutionContext (route/handler metadata), or None
# .signals           — SignalBus, or None
# .get_metadata(key, default=None) — convenience accessor for .metadata
```

### `AgentResponse`

```python
# Returned by AgentRunner.run()
# .content          — str — final text output from the agent
# .turns            — int — agentic loop iterations executed
# .total_usage      — TokenUsage — cumulative tokens across all turns
# .tool_calls_made  — list[ToolCall] — all tool executions during the run
# .stop_reason      — "end_turn" | "max_turns" | "budget_exceeded" | "error"
# .metadata         — dict[str, Any]
# .reasoning_traces — list[str] — extended thinking traces (Anthropic only)
# await .as_stream() — AsyncIterator[str] yielding content as a single item
```

### `AgentRunner` (Protocol) / `AgentRunnerBase` (concrete)

`AgentRunner` is a `@runtime_checkable Protocol` — the structural interface for all runner implementations. `AgentRunnerBase` is the concrete class that implements it.

In production, runners are created and injected by `AgentModule.for_root()`.
For testing, instantiate `AgentRunnerBase` directly:

```python
from lauren_ai import AgentRunner, AgentRunnerBase

# Testing: instantiate the concrete class
runner = AgentRunnerBase(
    transport=transport,   # Transport instance
    signals=signal_bus,    # Optional SignalBus
    cache_backend=None,    # Optional cache backend for tool results
)

isinstance(runner, AgentRunner)      # True — Protocol is @runtime_checkable
isinstance(runner, AgentRunnerBase)  # True

# Run an agent
response = await runner.run(
    agent_instance,               # @agent()-decorated class instance
    "What's the weather?",        # User message string
    conversation_id="sess-1",     # Optional session ID
    metadata={"user_id": "u1"},   # Optional metadata injected into AgentContext
    request=None,                 # Optional HTTP Request
    execution_context=None,       # Optional lauren ExecutionContext (forwarded to ToolContext)
    run_id=None,                  # Optional explicit run ID (random hex if omitted)
)
```

### Streaming

```python
async for chunk in runner.run_stream(agent_instance, "Hello"):
    if chunk.thinking_delta is not None:
        print("[thinking]", chunk.thinking_delta, end="", flush=True)
    elif chunk.delta:
        print(chunk.delta, end="", flush=True)

# chunk.delta          — str | None — incremental response text
# chunk.thinking_delta — str | None — incremental thinking text (Anthropic only)
```

---

## Extended Thinking

### Anthropic extended thinking

```python
from lauren_ai import agent, AgentConfig

@agent(
    model="claude-opus-4-6",
    system="You are a careful analyst.",
    thinking=True,
    thinking_budget_tokens=10_000,  # tokens the model may spend on internal reasoning
)
class AnalystAgent: ...
```

> **Temperature is suppressed when `thinking=True`.** Anthropic's API does not
> accept a temperature parameter when extended thinking is enabled. `AgentRunner`
> detects `thinking=True` and omits temperature from the call entirely, regardless
> of what `AgentConfig.temperature` is set to.

Supported models: `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-haiku-4-5`. Older pinned-version model IDs (e.g.
`claude-3-opus-20240229`) do not support it.

`AgentResponse.reasoning_traces` — `list[str]` — flat list of thinking text collected across all agentic-loop turns.

`Completion.thinking_blocks` — `list[ThinkingBlock | RedactedThinkingBlock]` — raw per-turn blocks before flattening.

```python
response = await runner.run(agent_inst, "Analyse the pros and cons of microservices.")
print(response.content)           # final answer
print(response.reasoning_traces)  # list[str] — one entry per thinking block
```

`thinking_budget_tokens` guidance:

| Budget | Use case |
|---|---|
| `2_000 – 5_000` | Simple reasoning, quick checks |
| `8_000` | Default — balanced |
| `16_000` | Complex multi-step analysis |
| `32_000` | Very hard problems (significant cost increase) |

The budget is a ceiling, not a target.

### OpenAI reasoning models (o1 / o3)

```python
from lauren_ai import agent

@agent(
    model="o3",                          # or "o1", "o1-mini", "o3-mini"
    reasoning_effort="high",             # "low" | "medium" | "high" | None
    include_reasoning_in_response=True,  # expose reasoning in AgentResponse
)
class ReasoningAgent: ...
```

`reasoning_effort` is silently ignored by non-o-series OpenAI models and by Anthropic / Ollama transports.


---

## Memory

### `ShortTermMemory`

```python
from lauren_ai import ShortTermMemory

mem = ShortTermMemory(max_tokens=40_000)
mem.add_user("Hello!")
mem.add_assistant(completion)     # Completion object
mem.add_tool_result(tool_result)  # ToolResult object

msgs = mem.messages()      # list[dict] — ready to pass to transport
snapshot = mem.snapshot()  # Save state
mem.restore(snapshot)      # Restore state
mem.clear()                # Clear all messages
```

### `InMemoryConversationStore`

```python
from lauren_ai import InMemoryConversationStore

store = InMemoryConversationStore()
await store.save("conv-1", messages)
loaded = await store.load("conv-1")  # [] if missing
await store.delete("conv-1")
```

When `AgentRunner` is configured with a `conversation_store` and `runner.run()` is called with a `conversation_id`, the runner automatically loads the prior history before adding the new user message and saves the updated history after the run completes. Wire it through `AgentModule.for_root()`:

```python
AIModule = AgentModule.for_root(
    agents=[MyAgent],
    conversation_store=InMemoryConversationStore(),
    imports=LLMProvider,
)

# Then in your controller:
resp = await runner.run(agent, "Hello", conversation_id="sess-1")
```

Without a `conversation_store` the `conversation_id` parameter is accepted but unused — each call starts with an empty `ShortTermMemory`.
### `InMemoryVectorStore`

```python
from lauren_ai import InMemoryVectorStore

store = InMemoryVectorStore()
doc_id = await store.upsert("Some text content.", metadata={"tag": "docs"})
results = await store.search("query text", k=5)
result = await store.get(doc_id)  # MemoryResult | None
await store.delete([doc_id])      # Takes a list
await store.clear()

# MemoryResult fields:
# result.id       — str
# result.content  — str
# result.score    — float (0..1, cosine similarity)
# result.metadata — dict
```

---

## Knowledge Base

```python
from lauren_ai import KnowledgeBase, TextLoader, FixedSizeChunker, SentenceChunker
from lauren_ai import InMemoryVectorStore
from lauren_ai import agent, use_tools, AgentModule

# Build a knowledge base
kb = KnowledgeBase(
    store=InMemoryVectorStore(),
    chunker=FixedSizeChunker(chunk_size=512, overlap=64),
)

# Load documents
n = await kb.load(TextLoader("docs/faq.txt"))                  # from file path
n = await kb.load(TextLoader("raw text here", is_file=False))  # from string

# Search
results = await kb.search("How do I reset my password?", top_k=5)

# Use as agent tool — manual pattern
tool_fn = kb.as_tool(name="search_knowledge_base", top_k=5)

@agent(model="claude-opus-4-6")
@use_tools(kb.as_tool())
class SupportAgent: ...

# Module-level pattern — auto-attach KB to every agent in the module.
# No @use_tools needed; the framework calls .as_tool() internally.
# When loaders= is supplied, the framework also generates a singleton
# @post_construct hook that loads them at app startup — no asyncio.run
# at module-import time, safe inside any async context.
from lauren_ai._knowledge import KnowledgeSource

AIModule = AgentModule.for_root(
    agents=[SupportAgent],
    imports=LLMProvider,
    knowledge=[
        KnowledgeSource(
            kb=KnowledgeBase(store=InMemoryVectorStore(), chunker=SentenceChunker()),
            tool_name="search_manual",
            top_k=3,
            loaders=[TextLoader("docs/manual.txt")],  # loaded at app startup
        ),
    ],
)

# Pre-populated KB — pass the bare instance, no loaders=:
# (caller has already done: await kb.load(TextLoader(...)))
AIModule = AgentModule.for_root(
    agents=[SupportAgent],
    imports=LLMProvider,
    knowledge=[kb],  # default tool name
)
```

### Chunkers

| Class | Description |
|---|---|
| `FixedSizeChunker(chunk_size=512, overlap=64)` | Split at fixed character count |
| `SentenceChunker(max_chunk_size=512)` | Split at sentence boundaries |

---

## Signals

```python
from lauren_ai import (
    SignalBus,
    ModelCallStarted,
    ModelCallComplete,
    ToolCallStarted,
    ToolCallComplete,
    AgentRunComplete,
)

bus = SignalBus()

@bus.on(ModelCallComplete)
async def on_model_call(event: ModelCallComplete) -> None:
    print(f"model={event.model} cost=${event.cost_usd:.6f}")
    print(f"tokens={event.usage.total_tokens} stop={event.stop_reason}")

@bus.on(ToolCallStarted)
async def on_tool_start(event: ToolCallStarted) -> None:
    print(f"calling tool: {event.tool_name}({event.input})")

@bus.on(ToolCallComplete)
async def on_tool_end(event: ToolCallComplete) -> None:
    print(f"tool done: success={event.success}")

# Remove handler
bus.off(ModelCallComplete, on_model_call)

# Emit manually (rarely needed)
await bus.emit(ModelCallStarted(model="mock", messages_count=1))
```

### Signal fields

| Signal | Fields |
|---|---|
| `ModelCallStarted` | `timestamp, model, agent_id, agent_class, messages_count, input_tokens_estimate` |
| `ModelCallComplete` | `timestamp, model, agent_id, agent_class, usage, duration_ms, stop_reason, cost_usd` |
| `ToolCallStarted` | `timestamp, tool_name, tool_use_id, input, agent_class` |
| `ToolCallComplete` | `timestamp, tool_name, tool_use_id, success, error_message, duration_ms` |
| `AgentRunComplete` | `timestamp, turns, agent_class, total_usage, total_cost_usd, stop_reason` |

---

## Modules

### `LLMModule`

```python
from lauren import module
from lauren_ai import LLMModule, LLMConfig

LLMProvider = LLMModule.for_root(
    LLMConfig.for_anthropic(),
    transport_override=None,  # Optional custom Transport (useful in tests)
)

# In your Lauren app:
@module(imports=[LLMProvider])
class AppModule: ...

# Access the built service before startup (class attributes):
# LLMModule.for_root() sets:
#   .transport_instance   — the Transport object
#   .llm_service_instance — the LLMService object
```

### `AgentModule`

**Rule: every `AgentModule.for_root()` call MUST have its own dedicated runner.**

Each call auto-generates a unique `AgentRunnerBase` subclass as the module's runner token. Inject it with `runner: AgentRunner` in any provider that belongs to the same module — the DI container resolves it via structural Protocol scan.

```python
from lauren import injectable, module
from lauren_ai import AgentModule, AgentRunner

AgentProvider = AgentModule.for_root(
    agents=[WeatherAgent, SupportAgent],
    tools=[get_weather, search_docs],  # function-form and class-form tools
    imports=LLMProvider,               # LLMModule (or list of modules)
    signals=signal_bus,                # Optional SignalBus
)

@module(imports=[LLMProvider, AgentProvider])
class AppModule: ...

# Providers inside AgentProvider's scope inject the runner with no ceremony:
@injectable()
class ChatService:
    runner: AgentRunner  # → the auto-generated runner for this module
```

`AgentModule.for_root()` registers:

- All `@agent()` classes as `Scope.SINGLETON` DI providers
- Function-form tools directly in the runner's tool map
- Class-form tools as DI providers (own-module only — not re-exported by default)
- A unique `AgentRunnerBase` subclass via `use_factory` (injecting Transport + LLMConfig)

#### Sharing a tool across multiple AgentModules (`shared_tools`)

When the same `@tool()` class is used by agents in more than one `AgentModule`, declaring it as a provider in each module raises `ModuleExportViolation` at startup. The solution is a dedicated ownership module that provides and exports the tool, combined with `shared_tools=` in every `AgentModule` that imports it:

```python
from lauren import module
from app.ai.check_auth_tool import CheckAuthenticationTool

# Ownership module — declares and exports the tool once.
@module(providers=[CheckAuthenticationTool], exports=[CheckAuthenticationTool])
class CheckAuthModule: ...

# Each AgentModule imports CheckAuthModule and lists the tool in shared_tools=
# so the framework skips re-registering it as a provider here.
UnauthMod = AgentModule.for_root(
    agents=[UnauthenticatedCRMAgent],
    imports=[LLMProvider, CheckAuthModule],
    shared_tools=[CheckAuthenticationTool],  # owned by CheckAuthModule
)

AuthMod = AgentModule.for_root(
    agents=[AuthenticatedCRMAgent],
    imports=[LLMProvider, CheckAuthModule],
    shared_tools=[CheckAuthenticationTool],  # owned by CheckAuthModule
)
```

`shared_tools` only suppresses the DI *declaration* — agents in both modules can still call `CheckAuthenticationTool` normally. The singleton is provided once by `CheckAuthModule` and resolved via the import chain.
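
The single-owner idea behind `shared_tools` can be illustrated with a toy registry. Everything in this sketch is hypothetical (`ContainerSketch`, `DuplicateProviderSketchError`, `CheckAuthToolSketch`); it is not the Lauren DI container and does not reproduce `ModuleExportViolation`, it only shows why skipping the declaration in the importing modules avoids a duplicate.

```python
class DuplicateProviderSketchError(Exception):
    """Hypothetical stand-in for the startup error described above."""


class ContainerSketch:
    """Toy registry that allows each provider class exactly one owning module."""

    def __init__(self) -> None:
        self._owners: dict[type, str] = {}

    def register_module(
        self, name: str, providers: list[type], shared: tuple[type, ...] = ()
    ) -> None:
        for provider in providers:
            if provider in shared:
                continue  # owned by another module: skip the declaration here
            if provider in self._owners:
                raise DuplicateProviderSketchError(
                    f"{provider.__name__} is already declared by {self._owners[provider]}"
                )
            self._owners[provider] = name

    def owner_of(self, provider: type) -> str:
        return self._owners[provider]


class CheckAuthToolSketch:
    """Placeholder for a class-form tool."""


container = ContainerSketch()
container.register_module("CheckAuthModule", [CheckAuthToolSketch])
# Both agent modules use the tool but mark it as shared, so no second declaration.
container.register_module("UnauthMod", [CheckAuthToolSketch], shared=(CheckAuthToolSketch,))
container.register_module("AuthMod", [CheckAuthToolSketch], shared=(CheckAuthToolSketch,))

try:
    container.register_module("BrokenMod", [CheckAuthToolSketch])  # not marked shared
    duplicate_rejected = False
except DuplicateProviderSketchError:
    duplicate_rejected = True

print(container.owner_of(CheckAuthToolSketch), duplicate_rejected)  # CheckAuthModule True
```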
#### Multi-module runner disambiguation (advanced)

When a controller or service sits in a scope that **imports two or more AgentModules**, `runner: AgentRunner` is ambiguous — both runners are visible and the DI container raises `ProtocolAmbiguityError`. Each module MUST define a dedicated named runner class and pass it via `runner=MyRunner`:

```python
from lauren import injectable, Scope
from lauren_ai import AgentModule, AgentRunnerBase

@injectable(scope=Scope.SINGLETON)
class TransferAgentRunner(AgentRunnerBase):
    """Distinct DI token for the Transfer module's runner."""

@injectable(scope=Scope.SINGLETON)
class CRMAgentRunner(AgentRunnerBase):
    """Distinct DI token for the CRM module's runner."""

TransferMod = AgentModule.for_root(
    agents=[TransferAgent],
    tools=[...],
    runner=TransferAgentRunner,  # explicit runner token
    imports=[LLMProvider],
)

CRMMod = AgentModule.for_root(
    agents=[CRMAgent],
    tools=[DelegateToBankingTransfer],
    runner=CRMAgentRunner,       # explicit runner token
    imports=[LLMProvider, TransferMod],
)

# Controller that needs both runners uses concrete types — no ambiguity:
class ChatController:
    def __init__(self, runner: CRMAgentRunner, transfer_runner: TransferAgentRunner): ...
```

`isinstance(crm_runner, AgentRunner)` and `isinstance(transfer_runner, AgentRunner)` both remain `True` — the named classes are concrete DI tokens only; `AgentRunner` is still the shared Protocol interface.

You can always define your own runner subclass to get a stable, named token for multi-module wiring.

### `LLMService`

Injected automatically when `LLMModule` is imported.

```python
from lauren_ai import LLMService, Message

class MyController:
    def __init__(self, llm: LLMService) -> None:
        self._llm = llm

    async def chat(self, text: str) -> str:
        result = await self._llm.complete(
            [Message.user(text)]
        )
        return result.content

    async def stream(self, text: str):
        async for chunk in await self._llm.complete(
            [Message.user(text)], stream=True
        ):
            yield chunk.delta

    async def embed_text(self, text: str) -> list[float]:
        embeddings = await self._llm.embed([text])
        return embeddings[0].vector

    # Annotation is quoted because StructuredLLM is not imported in this snippet.
    def with_structured_output(self, schema_class) -> "StructuredLLM":
        return self._llm.with_structured_output(schema_class)
```

---

## Testing

### `AgentTestClient`

```python
from lauren_ai.testing import AgentTestClient
from lauren_ai import Completion, TokenUsage
from lauren_ai._transport._mock import MockTransport

mock = MockTransport()
mock.queue_tool_use("get_weather", {"city": "Paris"})
mock.queue_response(Completion(
    id="c2",
    model="mock",
    content="It's sunny in Paris!",
    tool_calls=[],
    stop_reason="end_turn",
    usage=TokenUsage(input_tokens=50, output_tokens=10),
))

client = AgentTestClient(WeatherAgent, mock)

# Sync run
response = client.run("What's the weather in Paris?")
assert "sunny" in response.content
assert len(client.calls) == 2  # Two model calls: tool-use turn + final answer

# Async run
response = await client.run_async("Weather?")

# Reset state between tests
client.reset()
assert client.calls == []
```

### `LLMConfig.for_testing()` pattern

```python
from lauren_ai import LLMModule, LLMConfig, Completion, TokenUsage

cfg, mock = LLMConfig.for_testing()
mock.queue_response(Completion(
    id="t1",
    model="mock-model",
    content="Hello from mock!",
    tool_calls=[],
    stop_reason="end_turn",
    usage=TokenUsage(input_tokens=10, output_tokens=5),
))

LLMProvider = LLMModule.for_root(cfg, transport_override=mock)
```

---

## Interceptors

```python
from lauren_ai import ai_metrics_interceptor, token_usage_response_interceptor
from lauren import use_interceptors, module

@module(imports=[LLMProvider])
@use_interceptors(ai_metrics_interceptor(), token_usage_response_interceptor())
class AppModule: ...
```

- `ai_metrics_interceptor()` — reads `request.state.ai_token_usage` and emits metrics
- `token_usage_response_interceptor()` — adds `x-token-usage` and `x-ai-cost-usd` response headers

---

## Guards

```python
from lauren_ai import token_budget_guard, requires_capability, safety_guard, SafetyPolicy
from lauren import use_guards, controller

@use_guards(
    token_budget_guard(max_tokens_per_window=100_000, window_seconds=3600),
)
@controller("/api")
class ChatController: ...
```

| Guard factory | Description |
|---|---|
| `token_budget_guard(max_tokens_per_window, window_seconds)` | Enforce per-window token budget |
| `requires_capability(capability)` | Require a named model capability in request state |
| `safety_guard(policy)` | Block requests that violate a `SafetyPolicy` |

---

## Workflows

```python
from lauren_ai import Message
from lauren_ai._workflows import Workflow, Step, Parallel, Condition, Loop

# llm_service (an LLMService) and long_document (str) are assumed to be in scope.

async def summarise(ctx):
    text = ctx["text"]
    result = await llm_service.complete([Message.user(f"Summarise: {text}")])
    return {"summary": result.content}

async def translate(ctx):
    lang = ctx.get("lang", "Spanish")
    result = await llm_service.complete([Message.user(f"Translate to {lang}: {ctx['summary']}")])
    return {"translation": result.content}

pipeline = (
    Workflow()
    .then(Step("summarise", summarise))
    .then(Parallel(
        Step("translate_es", translate),
        Step("translate_fr", translate),
    ))
)

result = await pipeline.run({"text": long_document})
print(result.outputs["translate_es"])
```

---

## Evaluation

```python
from lauren_ai._eval import AccuracyEval, EvalDataset, EvalExample

dataset = EvalDataset(examples=[
    EvalExample(input="What is 2+2?", expected_output="4"),
    EvalExample(input="Capital of France?", expected_output="Paris"),
])

eval_suite = AccuracyEval(agent=MyAgent, dataset=dataset)
report = await eval_suite.run()

print(f"Pass rate: {report.pass_rate:.1%}")  # e.g. "100.0%"
report.assert_pass_rate(0.9)                 # Raises AssertionError if below 90%
```

---

## Built-in Skills

```python
from lauren_ai import agent, use_tools
from lauren_ai._skills import WebSearchTool, HttpFetchTool, CodeExecutionTool

@agent(model="claude-opus-4-6")
@use_tools(WebSearchTool, HttpFetchTool)
class ResearchAgent: ...
```

| Tool class | Description |
|---|---|
| `WebSearchTool` | Search the web via DuckDuckGo or Google |
| `HttpFetchTool` | Make HTTP requests and return response body |
| `CodeExecutionTool` | Execute Python code in a sandboxed subprocess |

---

## Exception hierarchy

```
LaurenAIError
├── TransportError
│   ├── TransientTransportError   (retriable: 429, 503, etc.)
│   └── AuthTransportError        (401, 403)
├── AgentMaxTurnsError
├── AgentBudgetExceededError
├── AgentConfigError
├── DecoratorUsageError
├── ToolExecutionError
├── ToolConfigError
├── ToolSchemaError
├── KnowledgeLoadError
├── WorkflowError
├── OutputParserError
│   └── MaxRetryError
├── EmptyQueueError
└── TracingError
```

---

## Section 32 — Prompt Templates & Chains

```python
from lauren_ai import PromptTemplate, ChatPromptTemplate, FewShotPromptTemplate
from lauren_ai import FewShotExample, Chain, StrOutputParser

# String template
tmpl = PromptTemplate("Translate the following to French: {text}")

# Chat template
chat = ChatPromptTemplate([
    ("system", "You are a professional translator."),
    ("user", "{text}"),
])

# Few-shot template
few_shot = FewShotPromptTemplate(
    examples=[
        FewShotExample(input="Good morning", output="Bonjour"),
        FewShotExample(input="Thank you", output="Merci"),
    ],
    template="Translate: {text}",
)

# Pipe composition — each element must implement Runnable
# (llm_service is an LLMService assumed to be in scope)
chain: Chain = tmpl | llm_service | StrOutputParser()
result: str = await chain.invoke({"text": "Hello world"})
```

**Key invariant**: `|` returns a new `Chain`; the operands are never mutated.

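
The "new chain, operands untouched" invariant can be demonstrated with a few lines of plain Python. `PipeSketch` and the three step functions below are hypothetical and far simpler than lauren-ai's `Chain` / `Runnable`; the sketch only shows one way `|` can build a fresh object from immutable step tuples.

```python
import asyncio


class PipeSketch:
    """Hypothetical minimal chain: an immutable tuple of async steps."""

    def __init__(self, *steps):
        self.steps = tuple(steps)

    def __or__(self, other):
        # Build a new PipeSketch; neither operand's step tuple is modified.
        other_steps = other.steps if isinstance(other, PipeSketch) else (other,)
        return PipeSketch(*self.steps, *other_steps)

    async def invoke(self, value):
        for step in self.steps:
            value = await step(value)
        return value


async def fill_template(data: dict) -> str:
    return f"Translate to French: {data['text']}"


async def fake_llm(prompt: str) -> str:
    return f"  [reply to: {prompt}]  "  # stand-in for a model call


async def strip_output(text: str) -> str:
    return text.strip()


prompt_only = PipeSketch(fill_template)
full_chain = prompt_only | fake_llm | strip_output

result = asyncio.run(full_chain.invoke({"text": "Hello world"}))
print(result)                  # [reply to: Translate to French: Hello world]
print(len(prompt_only.steps))  # 1, the left operand was not mutated
print(len(full_chain.steps))   # 3
```

Because composition never mutates its inputs, a partial chain such as `prompt_only` can be reused as the prefix of several different pipelines.
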

---

## Section 33 — Output Parsers

```python
from lauren_ai import (
    StrOutputParser, JSONOutputParser, RegexParser,
    CommaSeparatedListParser, MarkdownCodeBlockParser,
    PydanticOutputParser, RetryOutputParser,
)
from pydantic import BaseModel

class City(BaseModel):
    name: str
    country: str

# Parse LLM output as a Pydantic model
parser = PydanticOutputParser(City)
chain = ChatPromptTemplate([("user", "Name a capital city as JSON.")]) | llm_service | parser
city: City = await chain.invoke({})

# Retry up to 3 times if parsing fails
safe_parser = RetryOutputParser(parser, max_retries=3, llm=llm_service)

# Comma-separated list
list_parser = CommaSeparatedListParser()
items: list[str] = await list_parser.parse("apples, bananas, cherries")

# Extract first fenced code block
code_parser = MarkdownCodeBlockParser(language="python")
code: str = await code_parser.parse(llm_output)
```

**Key invariant**: Parsers raise `OutputParserError` on failure. `RetryOutputParser` wraps with automatic re-prompting up to `max_retries`.

---

## Section 34 — Agent Teams

```python
from lauren_ai import team, TeamRunner, TeamResult, TeamMemory
from lauren_ai import (
    TeamWorkerStarted, TeamWorkerFinished,
    TeamCoordinatorDecision, TeamFinalAnswer,
)

@agent(model="claude-opus-4-6", system="You research topics.")
class ResearchAgent: ...

@agent(model="claude-opus-4-6", system="You write clear summaries.")
class WriterAgent: ...

@team(
    name="content-pipeline",
    mode="coordinator",  # "coordinator" | "collaborate"
    max_rounds=5,
    model="claude-opus-4-6",
)
class ContentTeam:
    def __init__(self, researcher: ResearchAgent, writer: WriterAgent) -> None:
        self.researcher = researcher
        self.writer = writer

# Run the team
runner = TeamRunner(
    team_cls=ContentTeam,
    llm=llm_service,            # LLMService (from DI or LLMModule)
    agent_runner=agent_runner,  # shared AgentRunner
)
result: TeamResult = await runner.run("Write a post about async Python.")
print(result.final_answer)    # str — the team's final output
print(result.rounds)          # int
print(result.worker_outputs)  # dict[str, str] — per-agent contributions

# Stream team events
async for event in runner.run_stream("Write a post about async Python."):
    if isinstance(event, TeamWorkerStarted):
        print(f"[R{event.round}] Starting: {event.worker_name}")
    elif isinstance(event, TeamWorkerFinished):
        print(f"[R{event.round}] Done: {event.worker_name} — {event.result_content[:100]}")
    elif isinstance(event, TeamCoordinatorDecision):
        print(f"[Coordinator] {event.decision}")
    elif isinstance(event, TeamFinalAnswer):
        print(f"[Final] {event.content}")
```

**Key invariants**:
- `@team()` must use parentheses; bare `@team` raises `DecoratorUsageError`.
- Workers are declared as typed `__init__` parameters on the team class.
- `TeamRunner(team_cls, llm, agent_runner)` — `llm` is the `LLMService`.
- `run_stream()` is an async generator; iterate with `async for`.

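
For readers unfamiliar with the async-generator shape that `run_stream()` relies on, here is a minimal standalone sketch. It does not call lauren-ai: the event dataclasses and `run_stream_sketch` are hypothetical stand-ins that only mirror the field names used above (`round`, `worker_name`, `result_content`, `content`). It shows why the stream has to be consumed with `async for` (or an async comprehension) rather than awaited once. The events arrive in the order the workers ran, with the final answer last.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Union


@dataclass
class WorkerStarted:          # hypothetical stand-in event
    round: int
    worker_name: str


@dataclass
class WorkerFinished:         # hypothetical stand-in event
    round: int
    worker_name: str
    result_content: str


@dataclass
class FinalAnswer:            # hypothetical stand-in event
    content: str


Event = Union[WorkerStarted, WorkerFinished, FinalAnswer]


async def run_stream_sketch(task: str, workers: List[str]) -> AsyncIterator[Event]:
    """Yield one event at a time; a `yield` inside `async def` makes this an async generator."""
    outputs: List[str] = []
    for round_no, name in enumerate(workers, start=1):
        yield WorkerStarted(round_no, name)
        await asyncio.sleep(0)  # placeholder for the real, awaited agent call
        content = f"{name} handled: {task}"
        outputs.append(content)
        yield WorkerFinished(round_no, name, content)
    yield FinalAnswer(" | ".join(outputs))


async def collect(task: str, workers: List[str]) -> List[Event]:
    # An async comprehension is the compact form of `async for`.
    return [event async for event in run_stream_sketch(task, workers)]


events = asyncio.run(collect("draft", ["researcher", "writer"]))
```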

---

## Section 35 — Tracing & Observability

```python
from lauren_ai import (
    traced, SpanKind, Span, Trace, TraceStore, TracingConfig,
    InMemoryTraceExporter, ConsoleTraceExporter, FileTraceExporter,
    set_trace_store, get_trace_store,
)

# Configure tracing
exporter = InMemoryTraceExporter()
config = TracingConfig(service_name="my-app", sample_rate=1.0)
store = TraceStore(config=config, exporters=[exporter])
set_trace_store(store)

# Decorate an async function
@traced(name="summarise", kind=SpanKind.AGENT)
async def summarise(text: str) -> str:
    result = await llm_service.complete([Message.user(f"Summarise: {text}")])
    return result.content

# After execution:
traces: list[Trace] = exporter.traces
span: Span = traces[0].root_span
print(span.name, span.duration_ms, span.status)

# Implement a custom exporter
from lauren_ai import TraceExporter

class MyExporter(TraceExporter):
    async def export(self, trace: Trace) -> None: ...
```

**Key invariant**: `@traced()` must use parentheses. Spans are exported asynchronously; the decorated function's return value is unchanged.

---

## Section 36 — Persistent User Memory

```python
from lauren_ai import remember, MemoryFact, InMemoryUserMemoryStore

store = InMemoryUserMemoryStore()

@agent(model="claude-opus-4-6", system="You are a personal assistant.")
@remember(store=store, extract=True, inject=True, top_k=5)
class PersonalAgent: ...

# MemoryFact — stored facts
fact = MemoryFact(
    user_id="user-42",
    content="The user prefers concise answers.",
    confidence=0.9,
)

# Implement custom storage
from lauren_ai import UserMemoryStore

class RedisUserMemoryStore(UserMemoryStore):
    async def save(self, user_id: str, fact: MemoryFact) -> None: ...
    async def search(self, user_id: str, query: str, top_k: int) -> list[MemoryFact]: ...
    async def delete(self, user_id: str, fact_id: str) -> None: ...
```

**Key invariant**: `@remember()` must be placed **below** `@agent()` (applied first).
`extract=True` adds a post-run hook; `inject=True` prepends facts before each run.

---

## Section 37 — Structured LLM Outputs

```python
from lauren_ai import StructuredLLM, LLMService, Message
from pydantic import BaseModel

class SentimentResult(BaseModel):
    sentiment: str     # "positive" | "negative" | "neutral"
    confidence: float  # 0.0 – 1.0
    summary: str

# Obtain a typed wrapper from LLMService
structured: StructuredLLM[SentimentResult] = llm_service.with_structured_output(SentimentResult)

result: SentimentResult = await structured.complete(
    [Message.user("This product is amazing!")]
)
print(result.sentiment, result.confidence)

# Also works in a Chain
chain = prompt | structured
result = await chain.invoke({"review": "Great quality."})
```

**Key invariant**: Uses tool-calling under the hood; the model must support function/tool calling. Validation errors raise `OutputParserError`.

---

## Section 38 — Multimodal Inputs

```python
from lauren_ai import ImageContent, AudioContent, DocumentContent, ContentPart, Message

# Image from URL
img = ImageContent(url="https://example.com/chart.png")

# Image from bytes
img_bytes = ImageContent(data=raw_bytes, mime_type="image/png")

# Audio
audio = AudioContent(data=audio_bytes, mime_type="audio/mp3")

# PDF / document
doc = DocumentContent(data=pdf_bytes, mime_type="application/pdf")

# Build a mixed-content user message
msg = Message.from_multimodal("user", [
    "Please describe the chart and summarise the document:",
    img,
    doc,
])

# Pass directly to LLMService or as part of agent.run()
result = await llm_service.complete([msg])
```

**Key invariant**: Only transports that support multimodal will accept `ImageContent`/`AudioContent`/`DocumentContent`; unsupported transports raise `TransportError` at call time, not at startup.

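
The practical consequence of "at call time, not at startup" is that constructing a text-only transport always succeeds and the failure only surfaces when a message with non-text parts is actually sent. The sketch below illustrates that behaviour with hypothetical stand-ins (`SketchTransportError`, `SketchImage`, `TextOnlyTransport`, and the `supports_multimodal` flag are all invented for illustration); how lauren-ai's transports detect unsupported content is not described in this reference.

```python
from dataclasses import dataclass
from typing import List, Union


class SketchTransportError(Exception):
    """Hypothetical stand-in for a transport-level error."""


@dataclass(frozen=True)
class SketchImage:
    """Hypothetical stand-in for an image content part."""
    data: bytes
    mime_type: str


Part = Union[str, SketchImage]


class TextOnlyTransport:
    """Hypothetical transport that can only send text parts."""

    supports_multimodal = False  # assumed capability flag, for illustration only

    def complete(self, parts: List[Part]) -> str:
        # The capability check runs per call, so construction never fails.
        rich = [p for p in parts if not isinstance(p, str)]
        if rich and not self.supports_multimodal:
            kinds = sorted({type(p).__name__ for p in rich})
            raise SketchTransportError(f"unsupported content parts: {kinds}")
        return " ".join(parts)  # placeholder for a real model call


transport = TextOnlyTransport()               # succeeds: nothing is validated yet
text_reply = transport.complete(["hello", "world"])

try:
    transport.complete(["describe this", SketchImage(b"\x89PNG", "image/png")])
    failed_at_call_time = False
except SketchTransportError:
    failed_at_call_time = True
```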

---

## Section 39 — Semantic Router

```python
from lauren_ai import SemanticRouter, Route, RouteMatch

async def embed_fn(texts: list[str]) -> list[list[float]]:
    embeddings = await embed_service.embed(texts)
    return [e.vector for e in embeddings]

router = SemanticRouter(
    routes=[
        Route(name="weather", examples=["What's the weather?", "Is it raining today?"]),
        Route(name="booking", examples=["Book a flight", "Reserve a hotel room"]),
    ],
    embed_fn=embed_fn,
    min_confidence=0.7,
)
await router.compile()  # Must be called before route()

match: RouteMatch = await router.route("Will it snow tomorrow?")
print(match.route)       # "weather"
print(match.confidence)  # float, e.g. 0.91
print(match.matched)     # bool — False if below min_confidence
```

**Key invariant**: `compile()` must be awaited before any `route()` call; calling `route()` before `compile()` raises `RouterNotCompiledError`.

---

## Section 40 — Cost & Rate Tracking

```python
from lauren_ai import (
    CostTracker, TokenBudget, RateLimiter,
    PricingTable, ModelPricing, CostEstimate,
    default_pricing_table,
    BudgetExceededError, RateLimitExhaustedError,
)

# Pricing table
table: PricingTable = default_pricing_table()
custom_table = PricingTable(models={
    "claude-opus-4-6": ModelPricing(input_per_1k=0.015, output_per_1k=0.075),
})

# Cost tracker (injectable singleton)
tracker = CostTracker(pricing=table)
async with tracker.session() as session:
    result = await agent_runner.run(agent_instance, "Hello!")
    estimate: CostEstimate = session.estimate  # cost so far

report = await tracker.report()
print(report.total_usd, report.total_tokens)

# Token budget per run
budget = TokenBudget(max_tokens_per_conversation=100_000)
# Raises BudgetExceededError when limit exceeded

# Rate limiter
limiter = RateLimiter(requests_per_minute=60)
await limiter.acquire()  # Raises RateLimitExhaustedError when ceiling breached
```

**Key invariant**: `CostTracker` is safe to inject as `Scope.SINGLETON`; its internal counters are protected by an asyncio lock.

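
As a rough illustration of the lock-protected counters, the standalone sketch below accumulates token counts and cost from several concurrent tasks. `SketchPricing` and `SketchCostTracker` are hypothetical and not lauren-ai classes, and the sketch assumes `input_per_1k` / `output_per_1k` mean USD per 1,000 tokens, which the parameter names suggest but this reference does not state. With the prices shown, each call of 1,000 input and 200 output tokens costs 0.015 + 0.015 = 0.03, so ten calls total roughly 0.30.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchPricing:
    """Hypothetical pricing record; units assumed to be USD per 1,000 tokens."""
    input_per_1k: float
    output_per_1k: float


class SketchCostTracker:
    """Hypothetical tracker whose shared counters are guarded by an asyncio.Lock."""

    def __init__(self, pricing: SketchPricing) -> None:
        self._pricing = pricing
        self._lock = asyncio.Lock()
        self.total_tokens = 0
        self.total_usd = 0.0

    async def record(self, input_tokens: int, output_tokens: int) -> float:
        cost = (
            input_tokens / 1000 * self._pricing.input_per_1k
            + output_tokens / 1000 * self._pricing.output_per_1k
        )
        # Only one task at a time may update the shared totals.
        async with self._lock:
            self.total_tokens += input_tokens + output_tokens
            self.total_usd += cost
        return cost


async def demo() -> SketchCostTracker:
    # Created inside the running loop so the lock belongs to that loop.
    tracker = SketchCostTracker(SketchPricing(input_per_1k=0.015, output_per_1k=0.075))
    # Ten concurrent "requests" share one tracker instance.
    await asyncio.gather(*(tracker.record(1000, 200) for _ in range(10)))
    return tracker


tracker_result = asyncio.run(demo())
```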

---

## Section 41 — Guardrails & Content Safety

Two decorators are provided:

| Decorator | Purpose |
|---|---|
| `@use_guardrails()` | **Agent decorator** — attaches pre-built guardrail instances to an `@agent()` class. |
| `@guardrail()` | **Class decorator** — marks a class as a DI-injectable guardrail provider. Applies `@injectable(scope=Scope.SINGLETON)` automatically. |

```python
from lauren_ai import (
    use_guardrails, guardrail,
    TopicFilter, PIIRedactor, LengthFilter, PromptInjectionFilter,
    LLMGuardrail, GuardrailDecision, GuardrailContext, GuardrailViolated,
    InputGuardrail, OutputGuardrail,
    GUARDRAIL_CLASS_META, GuardrailClassMeta,
    USE_GUARDRAILS_META, UseGuardrailsMeta,
)

# --- @use_guardrails() — attach instances to an agent ---
@agent(model="claude-opus-4-6", system="Customer support assistant.")
@use_guardrails(
    input=[
        TopicFilter(allowed_topics=["billing", "support", "account"]),
        PIIRedactor(entities=["EMAIL", "PHONE", "SSN"]),
        PromptInjectionFilter(),
    ],
    output=[
        LengthFilter(max_chars=500),
    ],
)
class SupportAgent: ...

# None entries are silently dropped (conditional selection):
@agent(model="claude-opus-4-6")
@use_guardrails(
    input=[
        PromptInjectionFilter(),
        TopicFilter(allowed_topics=allowed) if allowed else None,
    ],
)
class DynamicAgent: ...

# --- @guardrail() — DI-injectable guardrail class ---
@guardrail(kind="input")  # also marks as @injectable(scope=Scope.SINGLETON)
class ProfanityFilter:
    async def check(self, message: str, ctx: GuardrailContext) -> GuardrailDecision:
        if "badword" in message.lower():
            return GuardrailDecision(
                action="block",
                violation="Profanity detected.",
                guardrail_name=type(self).__name__,
            )
        return GuardrailDecision(action="pass", guardrail_name=type(self).__name__)

# kind may be "input", "output", or "any" (default)

# --- LLMGuardrail — use another LLM to evaluate safety ---
# Basic usage (action="block" by default):
llm_guard = LLMGuardrail(
    llm=llm_service,
    prompt="Does the following text contain harmful content? Reply YES or NO.\n\n{content}",
    block_if="YES",
)

# Extended parameters — graceful redirect + cost-efficient judge call:
llm_guard = LLMGuardrail(
    llm=llm_service,
    prompt="Is this response outside the agent's scope?\n\n{content}\n\nYES or NO.",
    block_if="YES",
    action="modify",                 # replaces response instead of raising
    violation_message="I can't help with that. Redirecting you.",
    system="Answer with YES or NO only.",  # judge system prompt
    max_tokens=5,                    # YES/NO needs ≤1 token
    temperature=0.0,                 # deterministic
    guardrail_name="ScopeGuard",     # shown in activity feeds
)

# --- GuardrailDecision ---
decision = GuardrailDecision(
    action="modify",  # "pass" | "block" | "modify"
    modified_content="[REDACTED]",
    violation="PII detected: email address",
    guardrail_name="PIIRedactor",
)
```

**Key invariants**:
- `@use_guardrails()` must use parentheses; bare `@use_guardrails` raises `DecoratorUsageError`.
- `@guardrail()` must use parentheses; bare `@guardrail` raises `DecoratorUsageError`.
- Input guardrails run before the model call; output guardrails run after.
- A `"block"` decision raises `GuardrailViolated` (a `LaurenAIError`).
- `@guardrail()` checks for an existing `__lauren_injectable__` sentinel and skips re-applying `@injectable()` if already present (idempotent).

---

## Section 42 — Delegation Pattern (cross-module tool routing)

The recommended way to give an orchestrator agent access to a specialist sub-agent via a class-form `@tool()`.

### How the wiring works

- The delegation tool lives in the **calling module's `tools=`**.
- The calling module imports the target module (`imports=[..., SpecialistMod]`), which makes the target runner token visible to DI when resolving the delegation tool.
- The delegation tool's `__init__` uses the **named concrete runner subclass** (not the `AgentRunner` Protocol) — this is unambiguous even when two runners are in scope.

No `DelegationWiring` singleton, no `runner: AgentRunner | None = None` workaround.

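
To make the ambiguity argument concrete, here is a toy resolver, written only for illustration. It is not Lauren's DI container, and every name in it (`ToyContainer`, `ToyResolutionError`, `RunnerProtocol`, the two runner classes) is hypothetical. It assumes a container that first looks for an exact class match and otherwise falls back to a structural Protocol scan; under that assumption, asking for the Protocol fails as soon as two runners are registered, while asking for a named subclass keeps working.

```python
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class RunnerProtocol(Protocol):
    """Hypothetical structural token: anything with a run() method matches."""

    def run(self, task: str) -> str: ...


class RunnerBase:
    def run(self, task: str) -> str:
        return f"{type(self).__name__}: {task}"


class SpecialistRunner(RunnerBase):
    """Named token for one module's runner."""


class OrchestratorRunner(RunnerBase):
    """Named token for another module's runner."""


class ToyResolutionError(LookupError):
    """Raised when a token matches zero or several providers."""


class ToyContainer:
    """Toy resolver: exact class match first, structural scan second."""

    def __init__(self, *providers: Any) -> None:
        self._providers: List[Any] = list(providers)

    def resolve(self, token: type) -> Any:
        for provider in self._providers:
            if type(provider) is token:  # a named subclass matches exactly once
                return provider
        # Fallback: structural scan, which may match several providers.
        matches = [p for p in self._providers if isinstance(p, token)]
        if len(matches) != 1:
            raise ToyResolutionError(f"{token.__name__} matched {len(matches)} providers")
        return matches[0]


container = ToyContainer(SpecialistRunner(), OrchestratorRunner())
named = container.resolve(SpecialistRunner)

try:
    container.resolve(RunnerProtocol)
    protocol_was_ambiguous = False
except ToyResolutionError:
    protocol_was_ambiguous = True
```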

### Step 1 — define the named runner tokens and the delegation tool

```python
# delegation.py — do NOT add `from __future__ import annotations`
# (@tool() uses inspect.signature() at decoration time)
from lauren import injectable, Scope
from lauren_ai import AgentRunnerBase, ToolContext, tool
from .specialist_agent import SpecialistAgent

@injectable(scope=Scope.SINGLETON)
class SpecialistAgentRunner(AgentRunnerBase):
    """Named DI token for the Specialist module's runner.

    Using the named subclass (not ``AgentRunner`` Protocol) avoids
    ``ProtocolAmbiguityError`` in any scope that sees both runners.
    """

@injectable(scope=Scope.SINGLETON)
class OrchestratorAgentRunner(AgentRunnerBase):
    """Named DI token for the Orchestrator module's runner."""

@tool()
class DelegateToSpecialist:
    """Delegate a task to the SpecialistAgent.

    Args:
        task: Full description of what the Specialist Agent should do.
    """

    def __init__(self, agent: SpecialistAgent, runner: SpecialistAgentRunner) -> None:
        self._agent = agent
        self._runner = runner  # named subclass — no ambiguity

    async def run(self, ctx: ToolContext, task: str) -> dict:
        response = await self._runner.run(
            self._agent,
            task,
            execution_context=ctx.execution_context,
        )
        return {"result": response.content, "stop_reason": response.stop_reason}
```

### Step 2 — target module registers its named runner token

```python
# specialist_module.py
from lauren_ai import AgentModule, LLMModule, LLMConfig
from .specialist_agent import SpecialistAgent
from .delegation import SpecialistAgentRunner

LLMProvider = LLMModule.for_root(LLMConfig.for_anthropic())

SpecialistMod = AgentModule.for_root(
    agents=[SpecialistAgent],
    tools=[SpecialistTool1, SpecialistTool2],
    imports=[LLMProvider],
    runner=SpecialistAgentRunner,  # registers SpecialistAgentRunner as the runner token
)
```

### Step 3 — calling module owns the delegation tool and imports the target module

```python
# orchestrator_module.py
from lauren_ai import AgentModule
from .orchestrator_agent import OrchestratorAgent
from .delegation import DelegateToSpecialist, OrchestratorAgentRunner
from .specialist_module import SpecialistMod

OrchestratorMod = AgentModule.for_root(
    agents=[OrchestratorAgent],
    tools=[DelegateToSpecialist],         # ← delegation tool lives in the CALLING module
    imports=[LLMProvider, SpecialistMod], # ← import makes SpecialistAgentRunner visible to DI
    runner=OrchestratorAgentRunner,
)

# A controller in the outer module can inject both runners unambiguously:
class AppController:
    def __init__(
        self,
        runner: OrchestratorAgentRunner,
        specialist_runner: SpecialistAgentRunner,
    ) -> None: ...
```

### Architectural rule

> The delegation tool belongs in the **calling module's `tools=`**. The calling
> module imports the target module so the target runner token is visible. The tool's
> `__init__` must use the named concrete runner subclass — not `AgentRunner`
> Protocol — because two runners are in scope and Protocol scan would be ambiguous.
>
> Every `AgentModule.for_root()` call MUST have its own dedicated runner token.
> Use `runner=MyRunner` with an explicit `AgentRunnerBase` subclass whenever
> a controller, service, or tool needs to inject a specific module's runner.

### `AgentRunner.run()` accurate signature

**Critical invariant — pass an instance, not a class.**
`@agent()` applies `@injectable(scope=Scope.SINGLETON)`. `AgentModule.for_root()` registers and **exports** each agent so DI injects an instance into controllers. Passing the class itself bypasses DI and raises `TypeError` on lifecycle hooks:

```python
# WRONG — passes the class, not an instance; breaks on_start / on_finish hooks
response = await runner.run(MyAgent, "message")

# CORRECT — inject the agent via constructor DI, then pass the instance.
# runner: AgentRunner resolves to this module's runner via structural Protocol scan
# (unambiguous when only one AgentModule is in scope; use a named runner subclass
# when two or more AgentModules are visible — see Multi-module runner disambiguation).
class MyController:
    def __init__(self, runner: AgentRunner, agent: MyAgent) -> None:
        self._runner = runner
        self._agent = agent  # DI-resolved singleton

    async def handle(self, message: str) -> str:
        response = await self._runner.run(self._agent, message)
        return response.content
```

```python
# CORRECT
response = await runner.run(agent_instance, "User message here")

# CORRECT with optional kwargs
response = await runner.run(
    agent_instance,
    "User message",
    conversation_id="sess-123",
    metadata={"user_id": "u1"},
    request=http_request,          # optional Lauren Request
    execution_context=exec_ctx,    # optional lauren ExecutionContext (forwarded to ToolContext)
    run_id="run-abc",              # optional — random hex if omitted
)
```

---

## Section 43 — @team() Accurate API

```python
from lauren_ai import team, TeamRunner

@team(
    name="research-team",
    mode="coordinator",       # "coordinator" | "collaborate"
    model="claude-opus-4-6",  # model for coordinator decisions
    max_rounds=4,
    coordinator_prompt=None,  # Optional: override default routing prompt
)
class ResearchTeam:
    """Multi-agent team for research tasks."""

    def __init__(
        self,
        researcher: ResearchAgent,
        code_assistant: CodeAssistantAgent,
    ) -> None:
        self.researcher = researcher
        self.code_assistant = code_assistant

# TeamRunner constructor
runner = TeamRunner(
    team_cls=ResearchTeam,
    llm=llm_service,            # LLMService
    agent_runner=agent_runner,  # shared AgentRunner
)

# Blocking run
result = await runner.run("Research quantum computing breakthroughs.")
print(result.final_answer)
print(result.worker_outputs)  # {"researcher": "...", "code_assistant": "..."}
print(result.rounds)

# Streaming run
async for event in runner.run_stream("Research topic"):
    if isinstance(event, TeamWorkerStarted):
        print(f"[R{event.round}] Starting: {event.worker_name}")
    elif isinstance(event, TeamWorkerFinished):
        print(f"[R{event.round}] Done: {event.worker_name} — {event.result_content[:80]}")
    elif isinstance(event, TeamCoordinatorDecision):
        print(f"[Coordinator] {event.decision}")
    elif isinstance(event, TeamFinalAnswer):
        print(f"[Final] {event.content}")
```

**Default coordinator prompt format**:

```
ROUTE: — route to a worker
DONE: — declare task complete
```

**Key invariants**:
- `@team()` must use parentheses.
- Worker parameters in `__init__` must have agent-class type annotations.
- `run_stream()` is an async generator; iterate with `async for`.

---

## Decorator ordering — mandatory summary

```
@agent()           ← outermost decorator (sets AGENT_META, reads USE_TOOLS_META)
@remember()        ← optional (sets REMEMBER_META)
@use_guardrails()  ← optional (sets USE_GUARDRAILS_META)
@use_tools()       ← innermost (sets USE_TOOLS_META)
class MyAgent: ...
```

Python applies decorators bottom-up, so `@use_tools` runs first and sets `USE_TOOLS_META`; `@agent` runs last and reads it. Swapping any two decorators produces silently broken behaviour (missing tools, missing guardrails, etc.).

---

## Critical invariant — tool annotations must resolve

```python
# WRONG — unresolved forward refs still break schema generation:
from __future__ import annotations

@tool()
async def my_tool(param: LaterType) -> dict: ...  # LaterType is not defined yet

# CORRECT — future annotations are fine when every type resolves:
from __future__ import annotations
from my_types import ToolInput

@tool()
async def my_tool(param: ToolInput) -> dict: ...

# ALSO CORRECT — class-form tools can use the same pattern:
from __future__ import annotations

@tool()
class MyClassTool:
    def __init__(self, dep: SomeDependency, runner: AgentRunner | None = None) -> None: ...
    async def run(self, param: str) -> dict: ...
```

---

## Links

- Documentation: https://docs.lauren-framework.dev/lauren-ai/
- GitHub: https://github.com/lauren-framework/lauren-ai
- PyPI: https://pypi.org/project/lauren-ai/
- Changelog: https://github.com/lauren-framework/lauren-ai/blob/main/CHANGELOG.md