# lauren-ai

> First-party AI/LLM companion for the Lauren web framework.

`lauren-ai` integrates large-language-model completions, agentic loops, tool
execution, memory management, and structured workflows into Lauren applications
using the same decorator-first, DI-driven, module-scoped paradigm as Lauren itself.

## Installation

```bash
pip install "lauren-ai[anthropic]"   # Anthropic Claude (default)
pip install "lauren-ai[openai]"      # OpenAI / Ollama
pip install "lauren-ai[all]"         # All providers + all extras
```

## Core Concepts

### LLMConfig — provider connection

```python
from lauren_ai import LLMConfig

# Direct construction (provider is required)
config = LLMConfig(
    provider="anthropic",        # "anthropic" | "openai" | "ollama" | "litellm"
    model="claude-opus-4-6",     # Model ID
    api_key="sk-ant-...",        # API key (or reads env var)
    max_tokens=4096,             # Max output tokens (default: 4096)
    temperature=1.0,             # Sampling temperature (default: 1.0)
    timeout=60.0,                # Request timeout in seconds (default: 60.0)
    max_retries=3,               # Retries on transient errors (default: 3)
)

# Convenience factories (recommended)
config = LLMConfig.for_anthropic(model="claude-opus-4-6")   # reads ANTHROPIC_API_KEY
config = LLMConfig.for_openai(model="gpt-4o")               # reads OPENAI_API_KEY
config = LLMConfig.for_ollama(model="llama3.2")             # no key needed

# Test factory (zero network calls)
cfg, mock = LLMConfig.for_testing()                         # returns (LLMConfig, MockTransport)
```

### AgentConfig — runtime behaviour

```python
from lauren_ai import AgentConfig

config = AgentConfig(
    system_prompt="You are a helpful assistant.",  # Default system prompt
    max_turns=10,                                  # Max agentic loop turns (default: 10)
    max_tokens_per_turn=4096,                      # Max output tokens per turn (default: 4096)
    temperature=1.0,                               # Per-agent temperature override (default: 1.0)
    memory_window_tokens=40_000,                   # Sliding memory window (default: 40_000)
    max_cost_usd=None,                             # USD budget cap per run (None = unlimited)
    parallel_tool_calls=False,                     # Execute tools in parallel (default: False)
    tool_error_policy="return_error",              # "return_error" | "raise" | "skip"
    # Extended thinking (Anthropic only)
    thinking=False,
    thinking_budget_tokens=8_000,
    # OpenAI reasoning models
    reasoning_effort=None,                         # "low" | "medium" | "high" | None
)
```

### LLMModule — wiring into Lauren

```python
from lauren_ai import LLMModule, LLMConfig

LLMProvider = LLMModule.for_root(LLMConfig.for_anthropic())
```

### @tool() — define a tool for an agent

```python
from lauren_ai import tool, ToolContext

@tool()
async def get_weather(city: str, ctx: ToolContext | None = None) -> dict:
    """Return current weather for a city.

    Args:
        city: City name.
    """
    return {"city": city, "temperature": 22, "condition": "sunny"}
```

Rules:
- Must be invoked with parentheses: `@tool()`, not `@tool`
- Schema is generated from type annotations + Google-style `Args:` docstring section
- Any parameter annotated as `ToolContext` (or `ToolContext | None`) is injected by
  the runner and excluded from the JSON schema — the parameter may have **any name**
- `from __future__ import annotations` is supported, but tool annotations still
  need to resolve when `@tool()` builds the schema. Avoid unresolved forward
  refs or circular imports in function-form tool files.
- Class-form tools get `@injectable(scope=Scope.SINGLETON)` applied automatically; stack
  `@tool()` above `@injectable(scope=Scope.REQUEST)` to override the scope
- Subclasses of `@tool()`-decorated classes must re-apply `@tool()` themselves;
  the registry raises `MetadataInheritanceError` otherwise

### @agent() + @use_tools() — declare an agent

```python
from lauren_ai import agent, use_tools

@agent(system="You are a helpful assistant.")
@use_tools(get_weather)
class WeatherAgent: ...
```

- `@agent()` is the **outermost** decorator; `@use_tools()` is below it (applied first)
- Mandatory ordering, outermost first: `@agent` → `@remember` → `@use_guardrails` → `@use_knowledge_sources` → `@use_tools`
- Agents are singletons in the DI container by default
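
The "applied first" remark follows from how Python evaluates stacked decorators: the one closest to the class runs first and the outermost runs last. The sketch below demonstrates this with a hypothetical `marker` decorator that only records its name; it does not use lauren-ai.

```python
applied: list[str] = []


def marker(name: str):
    """Return a class decorator that records when it is applied."""
    def decorate(cls: type) -> type:
        applied.append(name)
        return cls
    return decorate


@marker("agent")
@marker("remember")
@marker("use_guardrails")
@marker("use_tools")
class DemoAgent: ...


print(applied)  # ['use_tools', 'use_guardrails', 'remember', 'agent']
```

Because `@agent()` runs last, it can see everything the inner decorators attached to the class.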

### AgentRunner (Protocol) / AgentRunnerBase (concrete)

`AgentRunner` is a `@runtime_checkable Protocol`. The concrete implementation is
`AgentRunnerBase`. In production, runners are created by `AgentModule.for_root()`.
For testing, instantiate `AgentRunnerBase` directly:

```python
from lauren_ai import AgentRunner, AgentRunnerBase

runner = AgentRunnerBase(transport=transport)
isinstance(runner, AgentRunner)  # True

response = await runner.run(
    agent_instance,              # @agent()-decorated class instance
    "What's the weather?",       # User message string
    conversation_id="sess-1",
    metadata={"user_id": "u1"},
    execution_context=exec_ctx,  # lauren ExecutionContext (optional, from route handler)
)

print(response.content)       # str — final text output
print(response.turns)         # int — agentic loop iterations
print(response.stop_reason)   # "end_turn" | "max_turns" | "budget_exceeded" | "error"
print(response.total_usage)   # TokenUsage — cumulative tokens
print(response.tool_calls_made)  # list[ToolCall]
print(response.reasoning_traces) # list[str] — extended thinking traces (Anthropic)
```
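
For readers unfamiliar with `@runtime_checkable`, the following standalone sketch shows why the `isinstance` check above can succeed without inheritance. `Runner`, `EchoRunner`, and `NotARunner` are hypothetical names, and the protocol here is much smaller than the real `AgentRunner`.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Runner(Protocol):
    async def run(self, agent: object, message: str) -> str: ...


class EchoRunner:
    """Matches the protocol structurally; it does not inherit from Runner."""

    async def run(self, agent: object, message: str) -> str:
        return message


class NotARunner:
    pass


print(isinstance(EchoRunner(), Runner))  # True
print(isinstance(NotARunner(), Runner))  # False
```

Note that a runtime protocol check only verifies that the method exists; it does not validate the signature.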

**Rule: every `AgentModule` MUST have its own runner.**
Each `AgentModule.for_root()` call auto-generates a unique `AgentRunnerBase`
subclass and registers `AgentRunner[agent_cls]` for every agent in `agents=`.

- **Single module:** inject with `runner: AgentRunner` (structural Protocol scan).
- **Multiple modules:** inject with `runner: AgentRunner[MyAgent]`, a cached real
  subclass that serves as an unambiguous DI token. No named `AgentRunnerBase`
  subclass boilerplate is needed.

```python
from lauren_ai import AgentRunner

class BankingController:
    def __init__(
        self,
        crm_runner:      AgentRunner[CRMAgent],
        transfer_runner: AgentRunner[TransferAgent],
    ) -> None: ...
```
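
One way a subscripted class can act as a "cached real subclass" is through `__class_getitem__`. The sketch below is an assumption about the general technique, not lauren-ai's actual code; `RunnerBase`, `_tokens`, and the `registry` dict standing in for a DI container are hypothetical.

```python
class RunnerBase:
    # Cache keyed by (base class, agent class) so each token is created once.
    _tokens: dict[tuple[type, type], type] = {}

    def __class_getitem__(cls, agent_cls: type) -> type:
        key = (cls, agent_cls)
        if key not in cls._tokens:
            # Create a genuine subclass that remembers which agent it serves.
            cls._tokens[key] = type(
                f"{cls.__name__}[{agent_cls.__name__}]",
                (cls,),
                {"agent_cls": agent_cls},
            )
        return cls._tokens[key]


class CRMAgent: ...
class TransferAgent: ...


# A plain dict stands in for the DI container: token class -> runner instance.
registry: dict[type, object] = {
    RunnerBase[CRMAgent]: RunnerBase[CRMAgent](),
    RunnerBase[TransferAgent]: RunnerBase[TransferAgent](),
}

crm_runner = registry[RunnerBase[CRMAgent]]
```

Because the cache returns the same class object each time, `RunnerBase[CRMAgent]` can be used as a stable lookup key, and two different agents never share a token.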

Memory and conversation storage are declared per agent, with
`@agent(memory=ShortTermMemory(...))` and `@agent(conversation_store=InMemoryConversationStore())`.
`AgentModule.for_root()` **no longer accepts** `memory=` or `conversation_store=` at the module level.

If an agent omits `model=...`, `AgentModule.for_root()` fills in the model from the
imported `LLMConfig` during module wiring.

### @use_guardrails() — attach guardrail instances to an agent

```python
from lauren_ai import use_guardrails, PromptInjectionFilter, PIIRedactor, LengthFilter, TopicFilter

@agent(model="claude-opus-4-6")
@use_guardrails(
    input=[PromptInjectionFilter(), PIIRedactor()],
    output=[LengthFilter(max_chars=8000)],
)
@use_tools(my_tool)
class SafeAgent: ...
```

Built-ins: `PromptInjectionFilter`, `PIIRedactor`, `LengthFilter`, `TopicFilter`, `LLMGuardrail`.

### @guardrail() — DI-injectable guardrail class

```python
from lauren_ai import guardrail, GuardrailDecision, GuardrailContext

@guardrail(kind="input")   # also marks as @injectable(scope=Scope.SINGLETON)
class ProfanityFilter:
    async def check(self, message: str, ctx: GuardrailContext) -> GuardrailDecision:
        if "badword" in message.lower():
            return GuardrailDecision(action="block", violation="Profanity",
                                     guardrail_name="ProfanityFilter")
        return GuardrailDecision(action="pass", guardrail_name="ProfanityFilter")
```

`kind` may be `"input"`, `"output"`, or `"any"` (default).

### @remember() — persistent user memory

```python
from lauren_ai import remember, InMemoryUserMemoryStore

_memory = InMemoryUserMemoryStore()

@agent(model="claude-opus-4-6")
@remember(store=_memory, extract=True, inject=True, top_k=5)
@use_guardrails(input=[PromptInjectionFilter()])
@use_tools(my_tool)
class PersonalAgent: ...
```

- Stack `@remember()` between `@agent()` and `@use_guardrails()`
- `extract=True`: facts are extracted from each turn and stored
- `inject=True`: the top-k relevant facts are prepended to the system prompt
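
As a rough illustration of the `inject=True` and `top_k` behaviour, the sketch below ranks stored facts against the incoming message and prepends the best ones to a system prompt. The word-overlap `score` function and the prompt layout are assumptions made for this example; the library's actual relevance ranking and formatting are not described here.

```python
def score(fact: str, query: str) -> int:
    """Toy relevance score: number of lowercase words shared with the query."""
    return len(set(fact.lower().split()) & set(query.lower().split()))


def inject_facts(system_prompt: str, facts: list[str], query: str, top_k: int = 5) -> str:
    """Prepend the top_k most relevant facts to the system prompt."""
    ranked = sorted(facts, key=lambda fact: score(fact, query), reverse=True)
    relevant = [fact for fact in ranked[:top_k] if score(fact, query) > 0]
    if not relevant:
        return system_prompt
    bullet_list = "\n".join(f"- {fact}" for fact in relevant)
    return f"Known facts about the user:\n{bullet_list}\n\n{system_prompt}"


facts = ["User lives in Oslo", "User prefers metric units", "User owns a cat"]
prompt = inject_facts(
    "You are a helpful assistant.", facts, "weather in Oslo today", top_k=1
)
```

With `top_k=1`, only the Oslo fact is injected because it is the only one sharing words with the query.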

### LLMService — low-level completions

```python
from lauren_ai import LLMService, Message
# `get` is the route decorator from the Lauren framework itself; import it from there.

class MyController:
    def __init__(self, llm: LLMService) -> None:
        self._llm = llm

    @get("/complete")
    async def complete(self) -> dict:
        result = await self._llm.complete(
            [Message.user("Say hello!")]
        )
        return {"text": result.content}
```

### ToolContext — injected run context

```python
from lauren_ai import ToolContext

# Any parameter name works — annotation determines injection:
@tool()
async def my_tool(query: str, ctx: ToolContext | None = None) -> dict:
    """..."""
    # ctx.agent_context      — AgentContext for the running agent
    # ctx.tool_use_id        — Provider-assigned tool call ID
    # ctx.turn               — Current agentic loop iteration (0-based)
    # ctx.request            — Originating HTTP request (or None)
    # ctx.execution_context  — lauren ExecutionContext (route/handler metadata, or None)
    # ctx.state              — Mutable dict for per-call local storage
    # ctx.get_metadata("key")  — Read from agent_context.metadata
    return {"result": query}
```

### Memory

```python
from lauren_ai import ShortTermMemory, InMemoryVectorStore, InMemoryConversationStore

mem = ShortTermMemory(max_tokens=40_000)
mem.add_user("Hello!")

store = InMemoryVectorStore()
conv = InMemoryConversationStore()
```
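
The `max_tokens` argument suggests a sliding window over the conversation. The sketch below shows one plausible eviction strategy (drop the oldest messages until the total fits), using a hypothetical `SlidingWindow` class and a whitespace word count as a stand-in for a real tokenizer. How `ShortTermMemory` actually counts and evicts is not specified here.

```python
from collections import deque


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


class SlidingWindow:
    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.messages: deque[tuple[str, str]] = deque()
        self.total = 0

    def add(self, role: str, content: str) -> None:
        self.messages.append((role, content))
        self.total += count_tokens(content)
        # Evict from the oldest end until the window fits again,
        # but always keep the newest message.
        while self.total > self.max_tokens and len(self.messages) > 1:
            _, dropped = self.messages.popleft()
            self.total -= count_tokens(dropped)


window = SlidingWindow(max_tokens=8)
window.add("user", "Hello there")              # 2 tokens, total 2
window.add("assistant", "Hi, how can I help")  # 5 tokens, total 7
window.add("user", "Weather please")           # 2 tokens, total 9, oldest evicted
```

After the third call, the first message has been dropped and the window holds the two most recent messages.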

### Signals — observability hooks

```python
from lauren_ai import SignalBus, ModelCallComplete, AgentRunComplete

bus = SignalBus()

@bus.on(ModelCallComplete)
async def on_complete(event: ModelCallComplete) -> None:
    print(f"Cost: ${event.cost_usd:.6f}, tokens: {event.usage.total_tokens}")

@bus.on(AgentRunComplete)
async def on_run_done(event: AgentRunComplete) -> None:
    print(f"Agent {event.agent_class.__name__} — turns={event.turns}, "
          f"total_cost=${event.total_cost_usd:.6f}")
```
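
The `@bus.on(EventType)` pattern can be understood as a registry of async handlers keyed by event class. The following sketch is a simplified stand-in, not the `SignalBus` implementation; `MiniBus`, `emit`, and `RunFinished` are hypothetical names.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class RunFinished:
    """Hypothetical event type used only in this sketch."""
    turns: int


class MiniBus:
    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable[..., Awaitable[None]]]] = defaultdict(list)

    def on(self, event_type: type):
        """Decorator that registers an async handler for one event type."""
        def register(handler):
            self._handlers[event_type].append(handler)
            return handler
        return register

    async def emit(self, event: object) -> None:
        """Await every handler registered for the event's exact type."""
        for handler in self._handlers[type(event)]:
            await handler(event)


bus = MiniBus()
seen: list[int] = []


@bus.on(RunFinished)
async def record_turns(event: RunFinished) -> None:
    seen.append(event.turns)


asyncio.run(bus.emit(RunFinished(turns=3)))
print(seen)  # [3]
```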

### @team() — multi-agent teams

```python
from lauren_ai import team, TeamRunner, TeamResult

@team(name="research", mode="coordinator", max_rounds=4, model="openai/gpt-4o-mini")
class ResearchTeam:
    def __init__(self, researcher: ResearchAgent, writer: WriterAgent) -> None:
        self.researcher = researcher
        self.writer = writer

runner = TeamRunner(team_cls=ResearchTeam, llm=llm_service, agent_runner=agent_runner)
result: TeamResult = await runner.run("Summarise AI safety research.")
print(result.final_answer)
print(result.worker_outputs)  # {"researcher": "...", "writer": "..."}
print(result.rounds)
```

Stream events via `runner.run_stream(...)`:
`TeamWorkerStarted`, `TeamWorkerFinished`, `TeamCoordinatorDecision`, `TeamFinalAnswer`.

### Delegation pattern — cross-module tool routing

The delegation tool lives in the **calling module's `tools=`**. The calling module
imports the target module so `AgentRunner[TargetAgent]` is visible to DI.
No named `AgentRunnerBase` subclass needed.

```python
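from lauren_ai import AgentModule, AgentRunner, tool, ToolContext

# SpecialistAgent, OrchestratorAgent, SpecialistTool1, and LLMProvider are
# assumed to be defined elsewhere in the application.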

@tool()
class DelegateToSpecialist:
    """Delegate a task to the SpecialistAgent.

    Args:
        task: Full description of the task.
    """
    def __init__(
        self,
        agent: SpecialistAgent,
        runner: AgentRunner[SpecialistAgent],   # ← parameterized token — no boilerplate
    ) -> None:
        self._agent = agent
        self._runner = runner

    async def run(self, ctx: ToolContext, task: str) -> dict:
        response = await self._runner.run(self._agent, task,
                                          execution_context=ctx.execution_context)
        return {"result": response.content}

# Target module: AgentRunner[SpecialistAgent] is auto-registered
SpecialistMod = AgentModule.for_root(
    agents=[SpecialistAgent],
    tools=[SpecialistTool1],
    imports=[LLMProvider],
)

# Calling module: owns the delegation tool; imports target module so
# AgentRunner[SpecialistAgent] is visible to DI when resolving DelegateToSpecialist
OrchestratorMod = AgentModule.for_root(
    agents=[OrchestratorAgent],
    tools=[DelegateToSpecialist],          # ← delegation tool lives HERE
    imports=[LLMProvider, SpecialistMod],  # ← makes AgentRunner[SpecialistAgent] visible
)
```

**Rule:** The delegation tool belongs in the **calling module's `tools=`**. The
calling module imports the target module so the target runner token is visible.

### @traced() — tracing & observability

```python
from lauren_ai import traced, SpanKind, TraceStore, InMemoryTraceExporter, set_trace_store

exporter = InMemoryTraceExporter()
store = TraceStore()
store._exporters = [exporter]
set_trace_store(store)

@traced(name="my_handler", kind=SpanKind.AGENT)
async def my_handler(request): ...
```

### Cost & Rate Tracking

```python
from lauren_ai import CostTracker, TokenBudget, RateLimiter, default_pricing_table

tracker = CostTracker(pricing=default_pricing_table())

budget = TokenBudget(max_tokens_per_conversation=100_000)
# Raises BudgetExceededError when the token limit is hit

limiter = RateLimiter(requests_per_minute=60)
await limiter.acquire()
# Raises RateLimitExhaustedError when the rate ceiling is breached
```
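
To illustrate what a `requests_per_minute` ceiling means in practice, here is a sketch of a sliding-window limiter that raises once the ceiling is reached. It is synchronous and uses an injectable clock so it can be exercised without waiting. `SlidingWindowLimiter` and `RateLimitExhausted` are hypothetical, and the real `RateLimiter` may wait, queue, or count differently.

```python
import time
from collections import deque
from typing import Callable


class RateLimitExhausted(Exception):
    """Hypothetical error raised when the per-minute ceiling is reached."""


class SlidingWindowLimiter:
    def __init__(
        self,
        requests_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = requests_per_minute
        self.clock = clock
        self.stamps: deque[float] = deque()

    def acquire(self) -> None:
        now = self.clock()
        # Forget requests that have left the 60-second window.
        while self.stamps and now - self.stamps[0] >= 60.0:
            self.stamps.popleft()
        if len(self.stamps) >= self.limit:
            raise RateLimitExhausted(f"more than {self.limit} requests in 60 s")
        self.stamps.append(now)


fake_now = [0.0]  # mutable cell so the test clock can be advanced
limiter = SlidingWindowLimiter(requests_per_minute=2, clock=lambda: fake_now[0])
limiter.acquire()
limiter.acquire()
try:
    limiter.acquire()
    third_allowed = True
except RateLimitExhausted:
    third_allowed = False

fake_now[0] = 61.0
limiter.acquire()  # the earlier requests have aged out, so this succeeds
```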

### StructuredLLM — typed LLM outputs

```python
from lauren_ai import StructuredLLM, LLMService, Message
from pydantic import BaseModel

class Sentiment(BaseModel):
    label: str
    score: float

# llm_service is an LLMService instance, e.g. one injected into a controller.
structured = llm_service.with_structured_output(Sentiment)
result: Sentiment = await structured.complete([Message.user("Great product!")])
```

### Prompt Templates & Chains

```python
from lauren_ai import PromptTemplate, ChatPromptTemplate, Chain, StrOutputParser

tmpl = PromptTemplate("Summarise: {text}")
chain = tmpl | llm_service | StrOutputParser()
result = await chain.invoke({"text": "Some long document..."})
```
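
The `|` composition above relies on Python's `__or__` operator. The sketch below shows the general idea with a hypothetical `Step` wrapper and a fake, synchronous "LLM" function; it makes no claim about how `Chain` is implemented, and the real `invoke` is awaited.

```python
class Step:
    """Wraps a one-argument callable and supports left-to-right piping."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # The new step feeds this step's output into the next step.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


template = Step(lambda variables: "Summarise: {text}".format(**variables))
fake_llm = Step(lambda prompt: f"  [summary of] {prompt}  ")  # stand-in for a model call
strip_output = Step(str.strip)

chain = template | fake_llm | strip_output
result = chain.invoke({"text": "Some long document..."})
```

Each `|` returns a new `Step`, so chains of any length compose into a single callable pipeline.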

### Output Parsers

```python
from lauren_ai import StrOutputParser, JSONOutputParser, PydanticOutputParser

# MyModel is a Pydantic model; prompt and llm_service are defined elsewhere.
parser = PydanticOutputParser(MyModel)
chain = prompt | llm_service | parser
```

Available: `StrOutputParser`, `JSONOutputParser`, `RegexParser`,
`CommaSeparatedListParser`, `MarkdownCodeBlockParser`, `PydanticOutputParser`,
`RetryOutputParser`.

### SemanticRouter

```python
from lauren_ai import SemanticRouter, Route, RouteMatch

router = SemanticRouter(
    routes=[Route(name="weather", examples=["What's the weather?", "Is it raining?"])],
    embed_fn=embed_fn,
    min_confidence=0.7,
)
await router.compile()
match: RouteMatch = await router.route("Will it snow today?")
```

### Multimodal inputs

```python
from lauren_ai import ImageContent, AudioContent, DocumentContent, Message

msg = Message.from_multimodal("user", [
    "Describe this image:",
    ImageContent(url="https://example.com/img.png"),
])
```

## Decorator quick-reference

| Decorator | Purpose | Parentheses required |
|-----------|---------|---------------------|
| `@tool()` | Expose a function or class as an agent tool | Yes |
| `@agent()` | Declare an agent class | Yes |
| `@use_tools(t1, t2)` | Attach tools to an agent | Yes |
| `@use_knowledge_sources(ks, ...)` | Opt agent into specific KB sources (opt-in — no decorator = no KB tools) | Yes |
| `@use_guardrails(...)` | Attach guardrail instances to an agent | Yes |
| `@guardrail()` | Mark a class as a DI-injectable guardrail | Yes |
| `@remember()` | Enable persistent user memory | Yes |
| `@team()` | Declare a multi-agent team | Yes |
| `@traced()` | Wrap with tracing spans | Yes |

## Decorator ordering (mandatory)

```
@agent()                    ← outermost
@remember()                 ← optional
@use_guardrails()           ← optional
@use_knowledge_sources()    ← optional (KB opt-in; must be below @agent)
@use_tools()                ← innermost
class Agent: ...
```

## Critical rule — tool annotations must resolve

`from __future__ import annotations` is supported in both agent and tool files, but
`@tool()` still resolves parameter types when the module is imported. Keep tool
annotations importable and avoid unresolved forward references or circular imports,
especially in function-form tools.
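
The failure mode can be reproduced with the standard library alone. `from __future__ import annotations` turns every annotation into a string, and anything that later resolves those strings, such as `typing.get_type_hints`, raises `NameError` if a name cannot be found. The sketch below writes the string annotations explicitly to show the same effect. `good_tool`, `bad_tool`, and `Order` are hypothetical, and whether `@tool()` uses `get_type_hints` internally is an assumption.

```python
from typing import get_type_hints


def good_tool(city: "str") -> "dict":
    return {"city": city}


def bad_tool(order: "Order") -> "dict":  # Order is never defined or imported
    return {}


hints = get_type_hints(good_tool)  # resolves the strings to real types

try:
    get_type_hints(bad_tool)
    error = None
except NameError as exc:
    # This is the kind of failure an unresolved forward reference causes.
    error = str(exc)
```

Importing the referenced type at module level, without a circular import, is enough to make the second case resolve.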

## Links

- Documentation: https://docs.lauren-framework.dev/lauren-ai/
- GitHub: https://github.com/lauren-framework/lauren-ai
- PyPI: https://pypi.org/project/lauren-ai/
- Changelog: https://github.com/lauren-framework/lauren-ai/blob/main/CHANGELOG.md