# lauren-ai

> First-party AI/LLM companion for the Lauren web framework.

`lauren-ai` integrates large-language-model completions, agentic loops, tool execution, memory management, and structured workflows into Lauren applications using the same decorator-first, DI-driven, module-scoped paradigm as Lauren itself.

## Installation

```bash
pip install "lauren-ai[anthropic]"   # Anthropic Claude (default)
pip install "lauren-ai[openai]"      # OpenAI / Ollama
pip install "lauren-ai[all]"         # All providers + all extras
```

## Core Concepts

### LLMConfig — provider connection

```python
from lauren_ai import LLMConfig

# Direct construction (provider is required)
config = LLMConfig(
    provider="anthropic",        # "anthropic" | "openai" | "ollama" | "litellm"
    model="claude-opus-4-6",     # Model ID
    api_key="sk-ant-...",        # API key (or reads env var)
    max_tokens=4096,             # Max output tokens (default: 4096)
    temperature=1.0,             # Sampling temperature (default: 1.0)
    timeout=60.0,                # Request timeout in seconds (default: 60.0)
    max_retries=3,               # Retries on transient errors (default: 3)
)

# Convenience factories (recommended)
config = LLMConfig.for_anthropic(model="claude-opus-4-6")   # reads ANTHROPIC_API_KEY
config = LLMConfig.for_openai(model="gpt-4o")               # reads OPENAI_API_KEY
config = LLMConfig.for_ollama(model="llama3.2")             # no key needed

# Test factory (zero network calls)
cfg, mock = LLMConfig.for_testing()   # returns (LLMConfig, MockTransport)
```

### AgentConfig — runtime behaviour

```python
from lauren_ai import AgentConfig

config = AgentConfig(
    system_prompt="You are a helpful assistant.",  # Default system prompt
    max_turns=10,                      # Max agentic loop turns (default: 10)
    max_tokens_per_turn=4096,          # Max output tokens per turn (default: 4096)
    temperature=1.0,                   # Per-agent temperature override (default: 1.0)
    memory_window_tokens=40_000,       # Sliding memory window (default: 40_000)
    max_cost_usd=None,                 # USD budget cap per run (None = unlimited)
    parallel_tool_calls=False,         # Execute tools in parallel (default: False)
    tool_error_policy="return_error",  # "return_error" | "raise" | "skip"
    # Extended thinking (Anthropic only)
    thinking=False,
    thinking_budget_tokens=8_000,
    # OpenAI reasoning models
    reasoning_effort=None,             # "low" | "medium" | "high" | None
)
```

### LLMModule — wiring into Lauren

```python
from lauren_ai import LLMModule, LLMConfig

LLMProvider = LLMModule.for_root(LLMConfig.for_anthropic())
```

### @tool() — define a tool for an agent

```python
from lauren_ai import tool, ToolContext

@tool()
async def get_weather(city: str, ctx: ToolContext | None = None) -> dict:
    """Return current weather for a city.

    Args:
        city: City name.
    """
    return {"city": city, "temperature": 22, "condition": "sunny"}
```

Rules:

- Must be invoked with parentheses: `@tool()` not `@tool`
- Schema is generated from type annotations + Google-style `Args:` docstring section
- Any parameter annotated as `ToolContext` (or `ToolContext | None`) is injected by the runner and excluded from the JSON schema — the parameter may have **any name**
- `from __future__ import annotations` is supported, but tool annotations still need to resolve when `@tool()` builds the schema. Avoid unresolved forward refs or circular imports in function-form tool files.
- Class-form tools get `@injectable(scope=SINGLETON)` applied automatically; use `@tool() @injectable(scope=Scope.REQUEST)` to override the scope
- Subclasses of `@tool()`-decorated classes must re-apply `@tool()` themselves; the registry raises `MetadataInheritanceError` otherwise

### @agent() + @use_tools() — declare an agent

```python
from lauren_ai import agent, use_tools

@agent(system="You are a helpful assistant.")
@use_tools(get_weather)
class WeatherAgent: ...
```

- `@agent()` is the **outermost** decorator; `@use_tools()` is below it (applied first)
- Mandatory ordering: `@agent` → `@remember` → `@use_guardrails` → `@use_tools`
- Agents are singletons in the DI container by default

### AgentRunner (Protocol) / AgentRunnerBase (concrete)

`AgentRunner` is a `@runtime_checkable Protocol`. The concrete implementation is `AgentRunnerBase`. In production, runners are created by `AgentModule.for_root()`. For testing, instantiate `AgentRunnerBase` directly:

```python
from lauren_ai import AgentRunner, AgentRunnerBase

runner = AgentRunnerBase(transport=transport)
isinstance(runner, AgentRunner)  # True

response = await runner.run(
    agent_instance,                 # @agent()-decorated class instance
    "What's the weather?",          # User message string
    conversation_id="sess-1",
    metadata={"user_id": "u1"},
    execution_context=exec_ctx,     # lauren ExecutionContext (optional, from route handler)
)

print(response.content)             # str — final text output
print(response.turns)               # int — agentic loop iterations
print(response.stop_reason)         # "end_turn" | "max_turns" | "budget_exceeded" | "error"
print(response.total_usage)         # TokenUsage — cumulative tokens
print(response.tool_calls_made)     # list[ToolCall]
print(response.reasoning_traces)    # list[str] — extended thinking traces (Anthropic)
```

**Rule: every `AgentModule` MUST have its own runner.** Each `AgentModule.for_root()` call auto-generates a unique `AgentRunnerBase` subclass and registers `AgentRunner[agent_cls]` for every agent in `agents=`.

- **Single module:** inject with `runner: AgentRunner` (structural Protocol scan).
- **Multiple modules:** inject with `runner: AgentRunner[MyAgent]` — a cached real subclass used as an unambiguous DI token. No named `AgentRunnerBase` subclass boilerplate needed.

```python
from lauren_ai import AgentRunner

class BankingController:
    def __init__(
        self,
        crm_runner: AgentRunner[CRMAgent],
        transfer_runner: AgentRunner[TransferAgent],
    ) -> None: ...
```

Both `@agent(memory=ShortTermMemory(...))` and `@agent(conversation_store=InMemoryConversationStore())` declare per-agent state. `AgentModule.for_root()` **no longer accepts** `memory=` or `conversation_store=` at the module level.

If an agent omits `model=...`, `AgentModule.for_root()` fills in the model from the imported `LLMConfig` during module wiring.

### @use_guardrails() — attach guardrail instances to an agent

```python
from lauren_ai import use_guardrails, PromptInjectionFilter, PIIRedactor, LengthFilter, TopicFilter

@agent(model="claude-opus-4-6")
@use_guardrails(
    input=[PromptInjectionFilter(), PIIRedactor()],
    output=[LengthFilter(max_chars=8000)],
)
@use_tools(my_tool)
class SafeAgent: ...
```

Built-ins: `PromptInjectionFilter`, `PIIRedactor`, `LengthFilter`, `TopicFilter`, `LLMGuardrail`.

### @guardrail() — DI-injectable guardrail class

```python
from lauren_ai import guardrail, GuardrailDecision, GuardrailContext

@guardrail(kind="input")  # also marks as @injectable(scope=Scope.SINGLETON)
class ProfanityFilter:
    async def check(self, message: str, ctx: GuardrailContext) -> GuardrailDecision:
        if "badword" in message.lower():
            return GuardrailDecision(action="block", violation="Profanity", guardrail_name="ProfanityFilter")
        return GuardrailDecision(action="pass", guardrail_name="ProfanityFilter")
```

`kind` may be `"input"`, `"output"`, or `"any"` (default).

### @remember() — persistent user memory

```python
from lauren_ai import remember, InMemoryUserMemoryStore

_memory = InMemoryUserMemoryStore()

@agent(model="claude-opus-4-6")
@remember(store=_memory, extract=True, inject=True, top_k=5)
@use_guardrails(input=[PromptInjectionFilter()])
@use_tools(my_tool)
class PersonalAgent: ...
```

- Stack between `@agent()` and `@use_guardrails()`
- `extract=True`: facts extracted from each turn and stored
- `inject=True`: top-k relevant facts prepended to system prompt

### LLMService — low-level completions

```python
from lauren_ai import LLMService, Message

class MyController:
    def __init__(self, llm: LLMService) -> None:
        self._llm = llm

    @get("/complete")
    async def complete(self) -> dict:
        result = await self._llm.complete(
            [Message.user("Say hello!")]
        )
        return {"text": result.content}
```

### ToolContext — injected run context

```python
from lauren_ai import ToolContext

# Any parameter name works — annotation determines injection:
@tool()
async def my_tool(query: str, ctx: ToolContext | None = None) -> dict:
    """..."""
    # ctx.agent_context         — AgentContext for the running agent
    # ctx.tool_use_id           — Provider-assigned tool call ID
    # ctx.turn                  — Current agentic loop iteration (0-based)
    # ctx.request               — Originating HTTP request (or None)
    # ctx.execution_context     — lauren ExecutionContext (route/handler metadata, or None)
    # ctx.state                 — Mutable dict for per-call local storage
    # ctx.get_metadata("key")   — Read from agent_context.metadata
    return {"result": query}
```

### Memory

```python
from lauren_ai import ShortTermMemory, InMemoryVectorStore, InMemoryConversationStore

mem = ShortTermMemory(max_tokens=40_000)
mem.add_user("Hello!")

store = InMemoryVectorStore()
conv = InMemoryConversationStore()
```

### Signals — observability hooks

```python
from lauren_ai import SignalBus, ModelCallComplete, AgentRunComplete

bus = SignalBus()

@bus.on(ModelCallComplete)
async def on_complete(event: ModelCallComplete) -> None:
    print(f"Cost: ${event.cost_usd:.6f}, tokens: {event.usage.total_tokens}")

@bus.on(AgentRunComplete)
async def on_run_done(event: AgentRunComplete) -> None:
    print(f"Agent {event.agent_class.__name__} — turns={event.turns}, "
          f"total_cost=${event.total_cost_usd:.6f}")
```

### @team() — multi-agent teams

```python
from lauren_ai import team, TeamRunner, TeamResult

@team(name="research", mode="coordinator", max_rounds=4, model="openai/gpt-4o-mini")
class ResearchTeam:
    def __init__(self, researcher: ResearchAgent, writer: WriterAgent) -> None:
        self.researcher = researcher
        self.writer = writer

runner = TeamRunner(team_cls=ResearchTeam, llm=llm_service, agent_runner=agent_runner)
result: TeamResult = await runner.run("Summarise AI safety research.")
print(result.final_answer)
print(result.worker_outputs)  # {"researcher": "...", "writer": "..."}
print(result.rounds)
```

Stream events via `runner.run_stream(...)`: `TeamWorkerStarted`, `TeamWorkerFinished`, `TeamCoordinatorDecision`, `TeamFinalAnswer`.

### Delegation pattern — cross-module tool routing

The delegation tool lives in the **calling module's `tools=`**. The calling module imports the target module so `AgentRunner[TargetAgent]` is visible to DI. No named `AgentRunnerBase` subclass needed.

```python
from lauren_ai import AgentRunner, tool, ToolContext

@tool()
class DelegateToSpecialist:
    """Delegate a task to the SpecialistAgent.

    Args:
        task: Full description of the task.
    """

    def __init__(
        self,
        agent: SpecialistAgent,
        runner: AgentRunner[SpecialistAgent],  # ← parameterized token — no boilerplate
    ) -> None:
        self._agent = agent
        self._runner = runner

    async def run(self, ctx: ToolContext, task: str) -> dict:
        response = await self._runner.run(self._agent, task, execution_context=ctx.execution_context)
        return {"result": response.content}

# Target module: AgentRunner[SpecialistAgent] is auto-registered
SpecialistMod = AgentModule.for_root(
    agents=[SpecialistAgent],
    tools=[SpecialistTool1],
    imports=[LLMProvider],
)

# Calling module: owns the delegation tool; imports target module so
# AgentRunner[SpecialistAgent] is visible to DI when resolving DelegateToSpecialist
OrchestratorMod = AgentModule.for_root(
    agents=[OrchestratorAgent],
    tools=[DelegateToSpecialist],            # ← delegation tool lives HERE
    imports=[LLMProvider, SpecialistMod],    # ← makes AgentRunner[SpecialistAgent] visible
)
```

**Rule:** The delegation tool belongs in the **calling module's `tools=`**. The calling module imports the target module so the target runner token is visible.

### @traced() — tracing & observability

```python
from lauren_ai import traced, SpanKind, TraceStore, InMemoryTraceExporter, set_trace_store

exporter = InMemoryTraceExporter()
store = TraceStore()
store._exporters = [exporter]
set_trace_store(store)

@traced(name="my_handler", kind=SpanKind.AGENT)
async def my_handler(request): ...
```

### Cost & Rate Tracking

```python
from lauren_ai import CostTracker, TokenBudget, RateLimiter, default_pricing_table

tracker = CostTracker(pricing=default_pricing_table())
budget = TokenBudget(max_tokens_per_conversation=100_000)
# Raises BudgetExceededError when token limit hit

limiter = RateLimiter(requests_per_minute=60)
await limiter.acquire()
# Raises RateLimitExhaustedError when ceiling breached
```

### StructuredLLM — typed LLM outputs

```python
from lauren_ai import StructuredLLM, LLMService, Message
from pydantic import BaseModel

class Sentiment(BaseModel):
    label: str
    score: float

structured = llm_service.with_structured_output(Sentiment)
result: Sentiment = await structured.complete([Message.user("Great product!")])
```

### Prompt Templates & Chains

```python
from lauren_ai import PromptTemplate, ChatPromptTemplate, Chain, StrOutputParser

tmpl = PromptTemplate("Summarise: {text}")
chain = tmpl | llm_service | StrOutputParser()
result = await chain.invoke({"text": "Some long document..."})
```

### Output Parsers

```python
from lauren_ai import StrOutputParser, JSONOutputParser, PydanticOutputParser

parser = PydanticOutputParser(MyModel)
chain = prompt | llm_service | parser
```

Available: `StrOutputParser`, `JSONOutputParser`, `RegexParser`, `CommaSeparatedListParser`, `MarkdownCodeBlockParser`, `PydanticOutputParser`, `RetryOutputParser`.

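
As a rough illustration of the kind of post-processing that parsers such as `CommaSeparatedListParser` and `MarkdownCodeBlockParser` are named for, here is a stdlib-only sketch. The functions `parse_comma_list` and `extract_code_block` are hypothetical and are not part of lauren-ai; the library's actual parsers may behave differently.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built here to avoid nesting fences

def parse_comma_list(text: str) -> list[str]:
    """Hypothetical helper: split raw model output on commas, dropping blanks."""
    return [item.strip() for item in text.split(",") if item.strip()]

def extract_code_block(text: str) -> str:
    """Hypothetical helper: return the body of the first fenced block, if any."""
    pattern = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(text)
    # Fall back to the whole reply when the model did not use a fence.
    return match.group(1).strip() if match else text.strip()

colours = parse_comma_list("red, green ,blue,")
snippet = extract_code_block("Here you go:\n" + FENCE + "python\nprint(1)\n" + FENCE)
```

Both helpers take the raw completion text and return a plain Python value, which is the role a parser plays as the last stage of a chain.
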
### SemanticRouter

```python
from lauren_ai import SemanticRouter, Route, RouteMatch

router = SemanticRouter(
    routes=[Route(name="weather", examples=["What's the weather?", "Is it raining?"])],
    embed_fn=embed_fn,
    min_confidence=0.7,
)
await router.compile()
match: RouteMatch = await router.route("Will it snow today?")
```

### Multimodal inputs

```python
from lauren_ai import ImageContent, AudioContent, DocumentContent, Message

msg = Message.from_multimodal("user", [
    "Describe this image:",
    ImageContent(url="https://example.com/img.png"),
])
```

## Decorator quick-reference

| Decorator | Purpose | Parentheses required |
|-----------|---------|---------------------|
| `@tool()` | Expose a function or class as an agent tool | Yes |
| `@agent()` | Declare an agent class | Yes |
| `@use_tools(t1, t2)` | Attach tools to an agent | Yes |
| `@use_knowledge_sources(ks, ...)` | Opt agent into specific KB sources (opt-in — no decorator = no KB tools) | Yes |
| `@use_guardrails(...)` | Attach guardrail instances to an agent | Yes |
| `@guardrail()` | Mark a class as a DI-injectable guardrail | Yes |
| `@remember()` | Enable persistent user memory | Yes |
| `@team()` | Declare a multi-agent team | Yes |
| `@traced()` | Wrap with tracing spans | Yes |

## Decorator ordering (mandatory)

```
@agent()                  ← outermost
@remember()               ← optional
@use_guardrails()         ← optional
@use_knowledge_sources()  ← optional (KB opt-in; must be below @agent)
@use_tools()              ← innermost
class Agent: ...
```

## Critical rule — tool annotations must resolve

`from __future__ import annotations` is supported in both agent and tool files, but `@tool()` still resolves parameter types when the module is imported. Keep tool annotations importable and avoid unresolved forward references or circular imports, especially in function-form tools.

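
The rule above follows from how postponed annotations work in Python: they are stored as strings and only turn into real types when something evaluates them. The sketch below uses only the standard library to show that step succeeding and failing. `build_schema` is a hypothetical stand-in for the schema step of a tool decorator, and it assumes resolution behaves like `typing.get_type_hints`; it is not lauren-ai's implementation.

```python
import typing

def build_schema(fn):
    """Hypothetical schema step: resolve annotations, map names to type names."""
    hints = typing.get_type_hints(fn)  # evaluates string annotations right now
    hints.pop("return", None)
    return {name: tp.__name__ for name, tp in hints.items()}

# With `from __future__ import annotations`, every annotation is stored as a
# string, exactly like the explicitly quoted annotations used here.
def good_tool(city: "str", days: "int") -> "dict":
    return {}

def bad_tool(city: "str", when: "UndefinedDate") -> "dict":  # name never defined
    return {}

schema = build_schema(good_tool)

try:
    build_schema(bad_tool)
    resolved = True
except NameError:
    # The unresolved forward reference fails at schema-build time,
    # not when the tool is eventually called.
    resolved = False
```

In this sketch the failure surfaces as soon as the schema is built, which is why a missing import or a circular import in a tool file would show up when the module loads rather than at call time.
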
## Links

- Documentation: https://docs.lauren-framework.dev/lauren-ai/
- GitHub: https://github.com/lauren-framework/lauren-ai
- PyPI: https://pypi.org/project/lauren-ai/
- Changelog: https://github.com/lauren-framework/lauren-ai/blob/main/CHANGELOG.md