# 12 Factor Agents Reference

A quick reference for applying 12 Factor Agents principles to multi-agent system (MAS) design. Based on the [12 Factor Agents](https://github.com/humanlayer/12-factor-agents) framework.

## The 12 Factors

| # | Factor | Principle | Primary Skill |
|---|--------|-----------|---------------|
| 1 | Natural language to tool calls | LLMs transform sentences into structured JSON | agent-specification |
| 2 | Own your prompts | Don't let frameworks abstract away prompts | agent-specification |
| 3 | Own your context building | Everything good is context engineering | coordination-patterns |
| 4 | Tools are structured outputs | Tool-use = JSON + deterministic code | agent-specification |
| 5/6 | Unified state | Execution + business state together | coordination-patterns |
| 7 | Contact humans with tools | HITL as first-class tool pattern | agent-specification |
| 8 | Own your control flow | Don't let the LLM control the entire DAG | coordination-patterns |
| 9 | Compact errors into context | Feed errors back for self-correction | production-readiness |
| 10 | Small focused agents | Focused prompts beat long autonomous runs | mas-decision-gate |
| 11 | Trigger from anywhere | Channel-agnostic agent triggering | production-readiness |
| 12 | Stateless reducers | (state, event) -> new_state | coordination-patterns |

## Factor Quick Checks

Use this checklist when designing agents:

### Specification Phase (Factors 1, 2, 4, 7)

- [ ] **F1**: Tool call schemas explicitly defined (JSON input/output)
- [ ] **F2**: Prompts visible and version-controlled (not hidden in framework)
- [ ] **F4**: Tools demystified as JSON + code (call, execution, result schemas)
- [ ] **F7**: Human contact tool included if high-stakes decisions are involved

### Coordination Phase (Factors 3, 5/6, 8, 12)

- [ ] **F3**: Context building explicit and owned (prompt + RAG + memory + history)
- [ ] **F5/6**: Execution and business state unified (launch/pause/resume)
- [ ] **F8**: Control flow owned by code, not LLM (break/switch/summarize/judge)
- [ ] **F12**: State transitions as pure reducers (replay, debug, test)

### Decision Phase (Factor 10)

- [ ] **F10**: Verified that a single agent can't solve the problem
- [ ] **F10**: Each agent in the MAS is small and focused
- [ ] **F10**: Complexity level appropriate (Level 0-4 progression)

### Operations Phase (Factors 9, 11)

- [ ] **F9**: Error context manager implemented (spin-out prevention)
- [ ] **F9**: Error counters per tool (max 3 attempts default)
- [ ] **F11**: Trigger interface supports required channels
- [ ] **F11**: Response routing configured per channel

## Factor Details

### Factor 1: Natural Language to Tool Calls

**Principle**: LLMs transform natural language into structured tool calls (JSON).

**Key insight**: The "magic" is in reliable structured output generation. Specify schemas explicitly.

**Implementation**: Define tool call schemas with exact JSON format for inputs and outputs.
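
As an illustration of an explicit schema, the sketch below parses raw model output into a typed tool call and rejects anything off-schema. The `create_ticket` tool, its fields, and the `{"tool": ..., "args": ...}` envelope are hypothetical choices made for this example, not part of the framework.

```python
import json
from dataclasses import dataclass

# Hypothetical tool and fields, chosen only for illustration.
ALLOWED_PRIORITIES = {"low", "medium", "high"}


@dataclass(frozen=True)
class CreateTicketCall:
    title: str
    priority: str  # must be one of ALLOWED_PRIORITIES


def parse_tool_call(raw: str) -> CreateTicketCall:
    """Parse model output into a typed tool call, rejecting off-schema input."""
    data = json.loads(raw)
    if not isinstance(data, dict) or data.get("tool") != "create_ticket":
        raise ValueError("unknown or missing tool name")
    args = data.get("args")
    if not isinstance(args, dict) or set(args) != {"title", "priority"}:
        raise ValueError("arguments do not match the expected keys")
    title, priority = args["title"], args["priority"]
    if not isinstance(title, str) or not isinstance(priority, str):
        raise ValueError("argument values must be strings")
    if priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"priority must be one of {sorted(ALLOWED_PRIORITIES)}")
    return CreateTicketCall(title=title, priority=priority)
```

Rejecting unknown keys and values at this boundary keeps everything downstream of the model deterministic.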

### Factor 2: Own Your Prompts

**Principle**: Treat LLMs as pure functions (tokens in -> tokens out). Don't let frameworks abstract away prompts.

**Key insight**: The prompt IS the agent specification. Version prompts together with the code that uses them.

**Anti-pattern**: "Let the framework handle prompts"

**Checklist**:
- System prompt visible, not hidden in framework
- Prompt changes require code review
- A/B testing capability for prompt variants
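
One way to keep a prompt visible and reviewable is to store it as plain text in the repository and tag each rendered prompt with a hash of its template. The sketch below assumes a hypothetical triage prompt; `TRIAGE_PROMPT` and `render_prompt` are illustrative names.

```python
import hashlib
from string import Template

# The prompt lives in source control as plain text, so any change shows up in review.
# The wording is a hypothetical example.
TRIAGE_PROMPT = Template(
    "You are a support triage agent.\n"
    "Classify the ticket below as one of: $labels.\n\n"
    "Ticket:\n$ticket"
)


def render_prompt(ticket: str, labels: list[str]) -> tuple[str, str]:
    """Return the rendered prompt and a short version id derived from the template."""
    prompt = TRIAGE_PROMPT.substitute(labels=", ".join(labels), ticket=ticket)
    # Hash the template, not the rendered text, so the id changes only when the prompt does.
    version = hashlib.sha256(TRIAGE_PROMPT.template.encode("utf-8")).hexdigest()[:12]
    return prompt, version
```

Logging the version id next to each model call makes it possible to compare prompt variants after the fact.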

### Factor 3: Own Your Context Building

**Principle**: Everything that makes agents good is context engineering.

**Context components**:
1. System prompt (agent identity)
2. RAG results (relevant knowledge)
3. Memory (past experiences)
4. Agentic history (workflow context)
5. Task input (current request)

**Key insight**: If you don't understand what happens at the token level, you miss optimization opportunities.
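
The five components above can be assembled by a small, explicit function instead of being left to a framework. This is a minimal sketch: the tag-style section labels and the `ContextInputs` type are assumptions made for the example, and any delimiter scheme would do.

```python
from dataclasses import dataclass, field


@dataclass
class ContextInputs:
    system_prompt: str
    rag_results: list[str] = field(default_factory=list)
    memory: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)
    task_input: str = ""


def build_context(inputs: ContextInputs) -> str:
    """Assemble the context window in a fixed, inspectable order."""
    sections = [
        ("system", [inputs.system_prompt]),
        ("knowledge", inputs.rag_results),
        ("memory", inputs.memory),
        ("history", inputs.history),
        ("task", [inputs.task_input]),
    ]
    parts = []
    for label, items in sections:
        kept = [item for item in items if item]  # skip empty sections entirely
        if kept:
            parts.append(f"<{label}>\n" + "\n".join(kept) + f"\n</{label}>")
    return "\n\n".join(parts)
```

Because the function returns a plain string, the exact text sent to the model can be logged, diffed, and measured.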

### Factor 4: Tools Are Structured Outputs

**Principle**: "Tool-Use" is just JSON output + deterministic code execution.

**Tool specification includes**:
- Call Schema (what agent outputs)
- Execution (what code does)
- Result Schema (what feeds back)

**Key insight**: Demystify tools. Nothing magical about them.
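
The three parts of a tool specification can be shown in a few lines: the model emits JSON (call schema), ordinary code runs (execution), and JSON goes back into context (result schema). The `add_numbers` tool and the envelope keys are hypothetical.

```python
import json
from typing import Callable


def add_numbers(a: float, b: float) -> dict:
    """Execution: plain deterministic code."""
    return {"sum": a + b}


# Registry mapping tool names to handlers (hypothetical example tool).
TOOLS: dict[str, Callable[..., dict]] = {"add_numbers": add_numbers}


def execute_tool_call(raw_call: str) -> str:
    """Take the model's JSON call, run the matching handler, return a JSON result."""
    call = json.loads(raw_call)          # call schema: {"tool": ..., "args": {...}}
    handler = TOOLS[call["tool"]]        # unknown tools raise KeyError
    result = handler(**call["args"])
    return json.dumps({"tool": call["tool"], "result": result})  # result schema
```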

### Factor 5/6: Unified Execution and Business State

**Principle**: Keep execution state and business state in one structure so that Launch/Pause/Resume can be exposed through simple APIs.

**Unified state includes**:
- Execution state: current step, next step, waiting status, retry config
- Business state: messages, tool calls, tool results, decisions made

**Benefits**: Pause anywhere, resume exactly, debug easily, replay possible.
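
A minimal sketch of unified state, assuming it fits in one JSON-serializable object: pausing is serialization and resuming is deserialization. The field names and the `pause`/`resume` helpers are illustrative.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class AgentState:
    # Execution state
    current_step: str = "start"
    waiting_for: Optional[str] = None
    retries_left: int = 3
    # Business state
    messages: list[dict] = field(default_factory=list)


def pause(state: AgentState) -> str:
    """Serialize the full state so it can be stored anywhere (database, queue, file)."""
    return json.dumps(asdict(state))


def resume(snapshot: str) -> AgentState:
    """Rebuild the exact state from a stored snapshot."""
    return AgentState(**json.loads(snapshot))
```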

### Factor 7: Contact Humans with Tools

**Principle**: Human-in-the-loop is a first-class pattern.

**When to use**:
- High-stakes decisions
- Ambiguous requirements
- Compliance/approval workflows
- Error recovery beyond agent capability

**Implementation**: `request_human_input` tool with urgency levels and timeout actions.
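
The framework names the tool but not its fields, so the shape below is an assumption: a request carrying an urgency level and a timeout action, plus a helper that decides what the loop does when no answer arrives.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Urgency(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class TimeoutAction(Enum):
    PROCEED_WITH_DEFAULT = "proceed_with_default"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class HumanInputRequest:
    """Hypothetical payload for a request_human_input tool call."""
    question: str
    urgency: Urgency
    timeout_seconds: int
    on_timeout: TimeoutAction


def resolve(request: HumanInputRequest, answer: Optional[str], default: str) -> str:
    """Return the human's answer, or apply the timeout action when answer is None."""
    if answer is not None:
        return answer
    if request.on_timeout is TimeoutAction.PROCEED_WITH_DEFAULT:
        return default
    raise TimeoutError(f"no response to {request.question!r}; escalating")
```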

### Factor 8: Own Your Control Flow

**Principle**: Don't let the LLM control the entire DAG.

**Control flow operations**:
- **Break**: Stop agent loop early
- **Switch**: Route to different agent
- **Summarize**: Compress context
- **Judge**: Evaluate quality

**Key insight**: A code-controlled DAG is easier to test, debug, and bound than an LLM-controlled one. Smaller focused prompts tend to win.
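
The four operations can live in an ordinary loop where the model proposes a step and code decides what happens next. This is a sketch under assumptions: `next_step` stands in for a model call returning a dict with hypothetical `intent`, `output`, and `target` keys, and the summary is a placeholder string.

```python
from typing import Callable

MAX_STEPS = 10    # hard budget enforced by code, not by the model
MAX_HISTORY = 4   # compress context once history grows past this


def run_loop(next_step: Callable[[list[str]], dict], judge: Callable[[str], bool]) -> str:
    history: list[str] = []
    for _ in range(MAX_STEPS):
        step = next_step(history)                 # model proposes the next step
        if step["intent"] == "done":
            if judge(step["output"]):             # Judge: evaluate quality
                return step["output"]             # Break: stop the loop early
            history.append("judge rejected: " + step["output"])
        elif step["intent"] == "handoff":
            return "handoff:" + step["target"]    # Switch: route to another agent
        else:
            history.append(step["output"])
        if len(history) > MAX_HISTORY:            # Summarize: compress context
            history = [f"summary of {len(history)} earlier steps"]
    raise RuntimeError("step budget exhausted")
```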

### Factor 9: Compact Errors into Context Window

**Principle**: Feed errors back into context so agents can self-correct.

**Spin-out prevention**:
- Max 3 errors per tool
- Max 5 total errors
- Escalate after repeated failures

**Key insight**: Error counters prevent infinite retry loops.
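
The limits above can be enforced with a small tracker that also produces the compacted error notes fed back into context. The sketch assumes the third error on a tool, or the fifth error overall, triggers escalation; `ErrorTracker` and the 200-character truncation are illustrative choices.

```python
from collections import Counter

MAX_ERRORS_PER_TOOL = 3
MAX_TOTAL_ERRORS = 5


class ErrorTracker:
    def __init__(self) -> None:
        self.per_tool: Counter = Counter()
        self.context_notes: list[str] = []   # compact errors to append to the context

    def record(self, tool: str, error: Exception) -> bool:
        """Record a failure. Return True if the agent may retry, False to escalate."""
        self.per_tool[tool] += 1
        # Keep only a short summary rather than a full stack trace.
        self.context_notes.append(f"{tool} failed: {type(error).__name__}: {str(error)[:200]}")
        total = sum(self.per_tool.values())
        return self.per_tool[tool] < MAX_ERRORS_PER_TOOL and total < MAX_TOTAL_ERRORS
```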

### Factor 10: Small Focused Agents

**Principle**: Smaller focused prompts with controlled context tend to beat long autonomous runs.

**The progression**:
```
Level 0: Deterministic workflow (no agent)
Level 1: Single focused agent
Level 2: Single agent with tools
Level 3: Minimal MAS (Planner->Executor->Verifier)
Level 4: Full MAS (when justified)
```

**Key insight**: Most tasks belong at Level 0-2. Only advance when evidence supports it.

### Factor 11: Trigger from Anywhere

**Principle**: Meet users where they are. Agent triggering should be channel-agnostic.

**Typical channels**: Slack, Email, CLI, API, Webhook, Dashboard

**Key insight**: Same workflow, any channel. Response routes back to origin.
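
A channel-agnostic entry point can normalize every inbound message into one trigger type and route the reply through a per-channel sender. In this sketch `Trigger`, `run_workflow`, and the `senders` mapping are hypothetical, and real senders would call the relevant channel's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Trigger:
    channel: str    # e.g. "slack", "email", "cli"
    reply_to: str   # channel-specific address to answer on
    text: str


def run_workflow(text: str) -> str:
    """Placeholder for the agent workflow; identical for every channel."""
    return f"processed: {text}"


def handle(trigger: Trigger, senders: dict[str, Callable[[str, str], None]]) -> None:
    """Run the same workflow, then route the response back to the originating channel."""
    result = run_workflow(trigger.text)
    senders[trigger.channel](trigger.reply_to, result)
```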

### Factor 12: Stateless Reducers

**Principle**: Agent logic as pure functions: (state, event) -> new_state

**Benefits**:
- **Replay**: Feed same events, get same state
- **Debugging**: Inspect state at any point
- **Testing**: Pure functions are easy to test
- **Time travel**: Roll back by replaying a subset of events
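
A minimal sketch of the `(state, event) -> new_state` shape, assuming events are plain dicts with a `type` key. The event types and the `State` fields are hypothetical.

```python
from dataclasses import dataclass, replace
from functools import reduce


@dataclass(frozen=True)
class State:
    step: int = 0
    messages: tuple[str, ...] = ()
    done: bool = False


def reducer(state: State, event: dict) -> State:
    """Pure function: no I/O, no mutation, always returns a new State."""
    if event["type"] == "message":
        return replace(state, step=state.step + 1, messages=state.messages + (event["text"],))
    if event["type"] == "finish":
        return replace(state, done=True)
    return state  # unknown events leave the state unchanged


def replay(events: list[dict]) -> State:
    """Rebuild state from scratch by folding the reducer over the event log."""
    return reduce(reducer, events, State())
```

Replaying a prefix of the event log, for example `replay(events[:1])`, yields the state as it was at that point.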

## Cross-References

### By Factor

| Factor | Primary File | Reference Files |
|--------|--------------|-----------------|
| F1, F2, F4, F7 | `agent-specification/SKILL.md` | `spec-templates.md`, `common-mistakes.md` |
| F3, F5/6, F8, F12 | `coordination-patterns/SKILL.md` | `state-management.md` |
| F9, F11 | `production-readiness/SKILL.md` | `ops-runbook.md` |
| F10 | `mas-decision-gate/SKILL.md` | `decision-tree.md` |

### By Topic

| Topic | Factors | Files |
|-------|---------|-------|
| Tool specification | F1, F4 | `agent-specification/SKILL.md`, `common-mistakes.md` |
| Prompt engineering | F2 | `agent-specification/SKILL.md`, `common-mistakes.md` |
| Context engineering | F3 | `coordination-patterns/SKILL.md` |
| State management | F5/6, F12 | `coordination-patterns/SKILL.md`, `state-management.md` |
| Human-in-the-loop | F7 | `agent-specification/SKILL.md`, `spec-templates.md` |
| Control flow | F8 | `coordination-patterns/SKILL.md` |
| Error handling | F9 | `production-readiness/SKILL.md`, `ops-runbook.md` |
| Simplicity | F10 | `mas-decision-gate/SKILL.md` |
| Multi-channel | F11 | `production-readiness/SKILL.md`, `ops-runbook.md` |

## Quick Decision Tree

```
Starting a new agent project?
│
├─ Is it truly non-deterministic? ──No──> Use scripts/workflows (Level 0)
│
├─ Can single agent handle it? ───Yes──> Single agent (Level 1-2)
│     │
│     └─ Does it need tools? ─────Yes──> Single agent + tools (Level 2)
│
├─ Is verification critical? ─────Yes──> Minimal MAS (Level 3)
│     │                                   Planner -> Executor -> Verifier
│     └─ Do you own control flow? ─No──> Add code-controlled DAG (F8)
│
└─ Multiple domains required? ────Yes──> Full MAS (Level 4)
      │                                   Only with evidence
      └─ Apply ALL 12 factors
```
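
The tree above can also be read as a function that returns the complexity level. This is one interpretation: the argument names are assumptions, and a project that matches none of the branches raises an error because the tree does not cover that case.

```python
def choose_level(
    *,
    non_deterministic: bool,
    single_agent_suffices: bool,
    needs_tools: bool,
    verification_critical: bool,
    multiple_domains: bool,
) -> int:
    """Return the complexity level (0-4) suggested by the decision tree."""
    if not non_deterministic:
        return 0                        # scripts/workflows
    if single_agent_suffices:
        return 2 if needs_tools else 1  # single agent, with or without tools
    if verification_critical:
        return 3                        # Planner -> Executor -> Verifier
    if multiple_domains:
        return 4                        # full MAS, only with evidence
    raise ValueError("decision tree does not cover this combination")
```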

## Further Reading

The 12 Factor Agents principles come from applying software engineering best practices (inspired by the 12-factor app methodology) to AI agent development.

**Core insights**:
1. **Demystify**: Agents are code + prompts + tools. No magic.
2. **Own everything**: Context, prompts, control flow, state.
3. **Design for operations**: Errors, channels, debugging.
4. **Start simple**: Single agent first, MAS only when justified.

**Source**: [humanlayer/12-factor-agents](https://github.com/humanlayer/12-factor-agents)