--- name: autopilot description: Full autonomous execution from idea to working code --- Autopilot takes a brief product idea and autonomously handles the full lifecycle: requirements analysis, technical design, planning, parallel implementation, QA cycling, and multi-perspective validation. It produces working, verified code from a 2-3 line description. - User wants end-to-end autonomous execution from an idea to working code - User says "autopilot", "auto pilot", "autonomous", "build me", "create me", "make me", "full auto", "handle it all", or "I want a/an..." - Task requires multiple phases: planning, coding, testing, and validation - User wants hands-off execution and is willing to let the system run to completion - User wants to explore options or brainstorm -- use `plan` skill instead - User says "just explain", "draft only", or "what would you suggest" -- respond conversationally - User wants a single focused code change -- use `ralph` or delegate to an executor agent - User wants to review or critique an existing plan -- use `plan --review` - Task is a quick fix or small bug -- use direct executor delegation Most non-trivial software tasks require coordinated phases: understanding requirements, designing a solution, implementing in parallel, testing, and validating quality. Autopilot orchestrates all of these phases automatically so the user can describe what they want and receive working code without managing each step. - Each phase must complete before the next begins - Parallel execution is used within phases where possible (Phase 2 and Phase 4) - QA cycles repeat up to 5 times; if the same error persists 3 times, stop and report the fundamental issue - Validation requires approval from all reviewers; rejected items get fixed and re-validated - Cancel with `/cancel` at any time; progress is preserved for resume 1. **Phase 0 - Expansion**: Turn the user's idea into a detailed spec - Analyst (Opus): Extract requirements - Architect (Opus): Create technical specification - Output: `.omx/plans/autopilot-spec.md` 2. **Phase 1 - Planning**: Create an implementation plan from the spec - Architect (Opus): Create plan (direct mode, no interview) - Critic (Opus): Validate plan - Output: `.omx/plans/autopilot-impl.md` 3. **Phase 2 - Execution**: Implement the plan using Ralph + Ultrawork - Executor-low (Haiku): Simple tasks - Executor (Sonnet): Standard tasks - Executor-high (Opus): Complex tasks - Run independent tasks in parallel 4. **Phase 3 - QA**: Cycle until all tests pass (UltraQA mode) - Build, lint, test, fix failures - Repeat up to 5 cycles - Stop early if the same error repeats 3 times (indicates a fundamental issue) 5. **Phase 4 - Validation**: Multi-perspective review in parallel - Architect: Functional completeness - Security-reviewer: Vulnerability check - Code-reviewer: Quality review - All must approve; fix and re-validate on rejection 6. **Phase 5 - Cleanup**: Clear all mode state via OMX MCP tools on successful completion - `state_clear({mode: "autopilot"})` - `state_clear({mode: "ralph"})` - `state_clear({mode: "ultrawork"})` - `state_clear({mode: "ultraqa"})` - Or run `/cancel` for clean exit - Before first MCP tool use, call `ToolSearch("mcp")` to discover deferred MCP tools - Use `ask_codex` with `agent_role: "architect"` for Phase 4 architecture validation - Use `ask_codex` with `agent_role: "security-reviewer"` for Phase 4 security review - Use `ask_codex` with `agent_role: "code-reviewer"` for Phase 4 quality review - Agents form their own analysis first, then consult Codex for cross-validation - If ToolSearch finds no MCP tools or Codex is unavailable, proceed without it -- never block on external tools ## State Management Use `omx_state` MCP tools for autopilot lifecycle state. - **On start**: `state_write({mode: "autopilot", active: true, current_phase: "expansion", started_at: ""})` - **On phase transitions**: `state_write({mode: "autopilot", current_phase: "planning"})` `state_write({mode: "autopilot", current_phase: "execution"})` `state_write({mode: "autopilot", current_phase: "qa"})` `state_write({mode: "autopilot", current_phase: "validation"})` - **On completion**: `state_write({mode: "autopilot", active: false, current_phase: "complete", completed_at: ""})` - **On cancellation/cleanup**: run `$cancel` (which should call `state_clear(mode="autopilot")`) User: "autopilot A REST API for a bookstore inventory with CRUD operations using TypeScript" Why good: Specific domain (bookstore), clear features (CRUD), technology constraint (TypeScript). Autopilot has enough context to expand into a full spec. User: "build me a CLI tool that tracks daily habits with streak counting" Why good: Clear product concept with a specific feature. The "build me" trigger activates autopilot. User: "fix the bug in the login page" Why bad: This is a single focused fix, not a multi-phase project. Use direct executor delegation or ralph instead. User: "what are some good approaches for adding caching?" Why bad: This is an exploration/brainstorming request. Respond conversationally or use the plan skill. - Stop and report when the same QA error persists across 3 cycles (fundamental issue requiring human input) - Stop and report when validation keeps failing after 3 re-validation rounds - Stop when the user says "stop", "cancel", or "abort" - If requirements were too vague and expansion produces an unclear spec, pause and ask the user for clarification before proceeding - [ ] All 5 phases completed (Expansion, Planning, Execution, QA, Validation) - [ ] All validators approved in Phase 4 - [ ] Tests pass (verified with fresh test run output) - [ ] Build succeeds (verified with fresh build output) - [ ] State files cleaned up - [ ] User informed of completion with summary of what was built ## Configuration Optional settings in `~/.codex/config.toml`: ```toml [omc.autopilot] maxIterations = 10 maxQaCycles = 5 maxValidationRounds = 3 pauseAfterExpansion = false pauseAfterPlanning = false skipQa = false skipValidation = false ``` ## Resume If autopilot was cancelled or failed, run `/autopilot` again to resume from where it stopped. ## Best Practices for Input 1. Be specific about the domain -- "bookstore" not "store" 2. Mention key features -- "with CRUD", "with authentication" 3. Specify constraints -- "using TypeScript", "with PostgreSQL" 4. Let it run -- avoid interrupting unless truly needed ## Troubleshooting **Stuck in a phase?** Check TODO list for blocked tasks, run `state_read({mode: "autopilot"})`, or cancel and resume. **QA cycles exhausted?** The same error 3 times indicates a fundamental issue. Review the error pattern; manual intervention may be needed. **Validation keeps failing?** Review the specific issues. Requirements may have been too vague -- cancel and provide more detail.