# Agent Templates — Rules for Modification These rules apply ONLY when modifying agent templates in this repository. An "agent template" is any top-level directory with the `agent-` prefix, excluding TypeScript/JavaScript templates (e.g., `agent-langchain-ts`). ## CRITICAL: Sync After Modifying Shared Sources Shared files are copied from source-of-truth directories into each template. **Never manually edit synced copies** (e.g. `{template}/quickstart.py`, `{template}/start_app.py`, `{template}/evaluate_agent.py`) — always edit the source file in `.scripts/source/` or `.claude/skills/` first, then run the sync command to propagate changes to all templates. | Source directory | Sync command | What it syncs | |---|---|---| | `.scripts/source/` | `uv run python .scripts/sync-scripts.py` | `quickstart.py`, `start_app.py`, `evaluate_agent.py` | | `.claude/skills/` | `uv run python .scripts/sync-skills.py` | All skill directories under `{template}/.claude/skills/` | After modifying any source file, run the corresponding sync command and commit the synced copies. ## Template Registry `.scripts/templates.py` is the canonical registry — see `TEMPLATES` dict for the full list of templates, their SDKs, and bundle names. All sync scripts and e2e tests import from it. To add a new template: add its entry to `TEMPLATES` in `.scripts/templates.py`, create the directory, then run both sync commands. ## Template Conventions ### Handler naming All handlers live in `agent_server/agent.py`. Use `@invoke()` and `@stream()` decorators from `mlflow.genai.agent_server`: ```python @invoke() async def invoke_handler(request: ResponsesAgentRequest) -> ResponsesAgentResponse: ... @stream() async def stream_handler(request: ResponsesAgentRequest) -> AsyncGenerator[ResponsesAgentStreamEvent, None]: ... ``` LangGraph `invoke_handler` delegates to `stream_handler`. OpenAI SDK `invoke_handler` calls `Runner.run()` independently. ### MLflow autologging - LangGraph templates: `mlflow.langchain.autolog()` - OpenAI SDK templates: `mlflow.openai.autolog()` + `set_trace_processors([])` All handlers tag traces with: `mlflow.update_current_trace(metadata={"mlflow.trace.session": session_id})` ### MCP server initialization - **LangGraph**: `DatabricksMultiServerMCPClient` wrapping `DatabricksMCPServer` objects. Tools fetched once at agent init via `.get_tools()`. - **OpenAI SDK**: `McpServer` used as async context manager per-request: `async with await init_mcp_server() as mcp_server:` ### Session/memory patterns | Pattern | SDK | Class | ID field | |---|---|---|---| | Short-term memory | LangGraph | `AsyncCheckpointSaver` | `thread_id` via `config["configurable"]` | | Long-term memory | LangGraph | `AsyncDatabricksStore` | `user_id` via `config["configurable"]` | | Short-term memory | OpenAI | `AsyncDatabricksSession` | `session_id` passed to `Runner.run(..., session=)` | All memory templates return the ID in `custom_outputs` so clients can reuse it. ### `databricks.yml` conventions - `bundle.name` uses underscores: `agent_langgraph` - App `name` uses hyphens: `agent-langgraph` - Memory template app names are abbreviated (`-stm`, `-ltm`) to stay within the 30-char limit - App command: `["uv", "run", "start-app"]` - Lakebase resources use `permission: 'CAN_CONNECT_AND_CREATE'` - Lakebase templates use `` as placeholder — quickstart replaces it - Templates do not define a DAB-managed experiment resource (`resources.experiments`); instead, the app resource references an experiment by ID (initially empty), and quickstart fills in the literal experiment ID ### `app.yaml` files The 3 base templates (`agent-langgraph`, `agent-openai-agents-sdk`, `agent-non-conversational`) have `app.yaml` files for UI-based template creation in the Databricks UI. These are separate from `databricks.yml` and use `valueFrom` (camelCase) for resource references. Memory/multiagent variants do not need separate `app.yaml` files. ### Per-template AGENTS.md Each template has its own `{template}/AGENTS.md` (loaded via `{template}/CLAUDE.md`). These are maintained individually and are **not synced** — they contain template-specific guidance for end users. ## Tests ### Quickstart unit tests The quickstart unit tests (`test_quickstart.py`) live in `.scripts/source/` but have no `pyproject.toml` of their own. Run them from `.scripts/agent-integration-tests/` which has pytest installed: ```bash cd .scripts/agent-integration-tests && uv run pytest ../source/test_quickstart.py -v ``` ### E2E integration tests Tests live in `.scripts/agent-integration-tests/`. Run from that directory. **Default runs both local and deploy.** Only add `--skip-deploy` or `--skip-local` when the user explicitly asks. **CRITICAL: Never pipe test output through `head`, `tail`, or other truncating commands.** This sends SIGPIPE to pytest, killing it before cleanup and leaving template files dirty. Always run pytest directly and use `run_in_background` to check on it later. ```bash # DEFAULT: All templates in parallel (local + deploy) uv run pytest test_e2e.py -v -n 7 # Single template (still runs both local + deploy) uv run pytest test_e2e.py -v --template agent-langgraph # Sequential with live output (for debugging) uv run pytest test_e2e.py -v -n0 -s --template agent-langgraph ``` Template test configs are in `.scripts/agent-integration-tests/template_config.py`. ## Editing Workflow Summary 1. **Changing a shared script** (`quickstart.py`, `start_app.py`, `evaluate_agent.py`) — edit in `.scripts/source/`, run `uv run python .scripts/sync-scripts.py` 2. **Changing a skill** — edit in `.claude/skills/`, run `uv run python .scripts/sync-skills.py` 3. **Changing template-specific agent code** — edit directly in `{template}/agent_server/` 4. **Adding a new template** — add to `.scripts/templates.py`, create directory, run both sync commands 5. **Changing `databricks.yml`** — edit directly in the template (not synced) 6. **After any change** — run e2e tests: `cd .scripts/agent-integration-tests && uv run pytest test_e2e.py -v -n 7 --skip-deploy` 7. **After any change** — review this file (`.claude/AGENTS.md`) and each affected template's `AGENTS.md` for inaccuracies, then update them to reflect the new state