--- name: autonomous-builder version: "1.0.0" description: "Full-stack software development agent for design, implementation, testing, and deployment. Use when the user explicitly asks for end-to-end project creation, feature development, bug fixing, or code refactoring." user-invocable: true allowed-tools: - Read - Write - Edit - Bash - Glob - Grep - WebFetch - WebSearch - Skill - Task - ToolSearch - mcp__ide__executeCode - mcp__ide__getDiagnostics --- # Autonomous Builder A fully autonomous software development agent that handles the complete software lifecycle: requirements analysis, architecture design, implementation, testing, debugging, and deployment. ## Architecture Pattern: Two-Agent Model **Based on Anthropic's official claude-quickstarts architecture** ``` ┌─────────────────────────────────────────────────────────────────┐ │ TWO-AGENT ARCHITECTURE │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ SESSION 1: INITIALIZER AGENT │ │ ┌─────────────────────────────────────────────────────────┐ │ │ │ • Read requirements / spec │ │ │ │ • Create project structure │ │ │ │ • Generate feature_list.json (200+ tests) │ │ │ │ • Initialize Git repository │ │ │ │ • ✨ Prompt for GitHub URL (optional) │ │ │ │ • ✨ Create README.md & PLANNING.md │ │ │ │ • Commit initial state │ │ │ │ • ✨ Push to GitHub & create issues │ │ │ └─────────────────────────────────────────────────────────┘ │ │ │ │ │ feature_list.json │ │ (Single Source of Truth) │ │ │ │ │ SESSIONS 2+: BUILDER AGENT (fresh context each session) │ │ ┌─────────────────────────────────────────────────────────┐ │ │ │ Step 1: Get Context (pwd, ls, git log, progress) │ │ │ │ Step 2: Start/verify server │ │ │ │ Step 3: Verify previous tests (regression check) │ │ │ │ Step 4: Select next "passes": false feature │ │ │ │ Step 5: Implement feature │ │ │ │ Step 6: Browser automation test │ │ │ │ Step 7: Update feature_list.json │ │ │ │ Step 8: Generate workflow report │ │ │ │ Step 9: Git commit + 
GitHub push │ │ │ │ Step 10: Update progress notes │ │ │ │ Step 11: Clean exit (auto-continue in 3s) │ │ │ └─────────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────┘ ``` **Key Design Principles (Official Pattern):** 1. **Fresh Context Per Session** - Each session uses brand new context window 2. **File-Based State Persistence** - Progress via feature_list.json, not context 3. **Git Commit as State Anchor** - Atomic progress units with easy rollback 4. **Browser Automation Testing** - Act like human user, verify via UI 5. **Auto-Continue with Delay** - 3 second delay between sessions ## Core Philosophy **The Autonomous Development Loop:** ``` PLAN -> BUILD -> TEST -> DEBUG -> DEPLOY -> (REPEAT) | | +------------------------------------+ ``` **Key Principles:** 1. **Self-Sufficient**: No user intervention required during execution 2. **State-Persistent**: Recovers from interruptions via `.builder/` state files 3. **Multi-Language**: Auto-detects and adapts to project technology stack 4. **Incremental**: Completes one feature at a time, commits progress 5. **Error-Resilient**: 3-strike protocol with automatic recovery strategies ## When to Use This Skill Use this skill when the user explicitly wants this agent to own an end-to-end build or major refactor, such as: - Starting a new project from a full specification - Continuing a previously initialized `.builder/` project - Driving a broad feature build across multiple implementation steps - Performing an explicit refactor or modernization effort across the codebase Use stage assistants or other routed specialists for narrow bug fixes, one-off debugging, or scoped edits that do not need full lifecycle ownership. 
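The file-based state principle and Step 4 of the builder loop described above ("Select next `"passes": false` feature") can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: it assumes `feature_list.json` is a JSON array of objects carrying an `id` and a boolean `passes` field, and the function names `next_failing_feature` and `mark_feature_passed` are hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def next_failing_feature(feature_file: Path) -> Optional[dict]:
    """Return the first feature whose "passes" flag is not true, or None.

    Assumed layout (hypothetical): a JSON array of objects such as
    {"id": "f1", "passes": false}.
    """
    features = json.loads(feature_file.read_text(encoding="utf-8"))
    for feature in features:
        # A missing "passes" key is treated as "not yet passing".
        if not feature.get("passes", False):
            return feature
    return None


def mark_feature_passed(feature_file: Path, feature_id: str) -> None:
    """Set "passes" to true for one feature and write the file back."""
    features = json.loads(feature_file.read_text(encoding="utf-8"))
    for feature in features:
        if feature.get("id") == feature_id:
            feature["passes"] = True
    feature_file.write_text(json.dumps(features, indent=2), encoding="utf-8")
```

Because every session re-reads the file instead of relying on conversation context, a fresh session can pick up exactly where the previous one stopped.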
## Not For / Boundaries - **Security-critical systems** without human review - **Production deployments** without user confirmation - **Legal/compliance-sensitive code** without audit - **Data migration** without backup verification - **Infrastructure changes** without explicit approval - **System-level operations** outside workspace (see SAFETY CRITICAL below) **Required inputs (ask if missing):** 1. Project requirements or specification 2. Target platform/environment (web, CLI, mobile, etc.) 3. Preferred language/framework (or auto-detect) **Safety First:** All operations that could affect system stability, data integrity, or files outside the workspace require explicit user approval. See **SAFETY CRITICAL** section below for details. ## Quick Reference ### Session Continuity (Auto-Resume) **⚠️ Critical for Unattended Long-Running Operation** ``` AUTO-RESUME PROTOCOL: ┌─────────────────────────────────────────────────────────────────┐ │ Session Start │ │ │ │ │ ▼ │ │ Check .builder/state.json exists? │ │ │ │ │ ├─ NO → Initialize new project │ │ │ │ │ └─ YES → Resume from saved state: │ │ 1. Read current_phase │ │ 2. Read current_feature │ │ 3. Read pending_features[] │ │ 4. Continue from last checkpoint │ │ │ │ After each feature completion: │ │ │ │ │ ▼ │ │ More pending features? │ │ │ │ │ ├─ YES → Auto-start next feature (NO user input needed) │ │ │ │ │ └─ NO → All complete! 
Generate report │
└─────────────────────────────────────────────────────────────────┘
```

**Auto-Continue Rules:**

| Condition | Action | User Input Required |
|-----------|--------|---------------------|
| Feature completed, more pending | Auto-start next | **NO** |
| Error recovered successfully | Continue current | **NO** |
| 3-strike error failed | Skip and continue | **NO** (unless critical) |
| Loop detected & resolved | Resume from checkpoint | **NO** |
| All features complete | Generate final report | **NO** |

**State Persistence After Each Operation:**

```json
{
  "auto_continue": true,
  "resume_token": "feat-003-phase-implement",
  "next_action": "Continue implementing feat-003",
  "features_remaining": 3,
  "estimated_completion": "2026-02-14T18:00:00Z"
}
```

### Automatic Task Queue

```python
from datetime import datetime

# ProjectState, save_checkpoint, save_state, select_next_feature, log_progress,
# ContinueAction, CompleteAction and generate_final_report are assumed to be
# defined elsewhere in the builder.

# After completing a feature, automatically proceed:
def on_feature_complete(feature_id: str, state: "ProjectState"):
    """Called when a feature is marked complete."""
    # 1. Save checkpoint
    save_checkpoint(state, feature_id)

    # 2. Update feature status
    state.features[feature_id].status = "completed"
    state.features[feature_id].completed_at = datetime.now()

    # 3. Check for pending features
    # state.features is keyed by feature id, so iterate over its values
    pending = [f for f in state.features.values() if f.status == "pending"]

    if pending:
        # 4. Auto-select next feature (NO user input)
        next_feature = select_next_feature(pending, state)
        state.current_feature = next_feature.id
        state.current_phase = "implement"

        # 5. Save state immediately
        save_state(state)

        # 6. LOG and CONTINUE (not ask user)
        log_progress(f"Auto-continuing to {next_feature.name}")
        return ContinueAction(feature=next_feature)
    else:
        # All complete!
        return CompleteAction(report=generate_final_report(state))
```

**Resume Message on Session Start:**

```markdown
## 🔄 Session Resume Detected

**Previous Session**: Session #5
**Last Activity**: 2 hours ago
**Current Feature**: feat-003 (User Authentication)
**Phase**: implement (60% complete)

**Pending Features**: 3 remaining
- feat-004: API Rate Limiting
- feat-005: Email Notifications
- feat-006: Final Documentation

**Auto-Continuing**: Resuming feat-003 implementation...

[Proceeding without user input - type "pause" to stop]
```

### Directory Structure

```
.builder/
├── state.json                  # Current project state
├── features.json               # Feature list with status
├── architecture.md             # Design decisions
├── progress.md                 # Session log
├── errors.json                 # Error history and resolutions
├── checkpoints/                # Recovery checkpoints
├── auto-continue.{sh,bat,ps1}  # Auto-restart script (auto-generated)
└── supervisor.json             # Self-supervision config
```

### Skill Recommendations & Router Handoff

**⚠️ Skill discovery is advisory. The host router remains the only main-route authority.**

```markdown
ON PROJECT INITIALIZATION:
1. Check for Claude_Skills_中文指南.md in workspace root
2. If found:
   - Read and parse skill catalog
   - Store available skills in state.json
3. For each feature:
   - Analyze feature requirements
   - Match against skill catalog
   - Add recommended_skills to feature definition as router-handoff suggestions

DURING IMPLEMENTATION:
1. Before each implementation step:
   - Check step's invoke_skill field
   - Or analyze step for skill match
2. Request router-approved handoff:
   - Propose the matched skill to the host router or current route authority
   - Use the Skill tool only after that router-authorized handoff or an explicit user request
   - Continue with the returned guidance once the handoff is granted
3.
Log router-approved skill usage to state.json ``` **Task-to-Skill Mapping (Recommended):** | Task Type | Recommended Skills | |-----------|--------------------| | Code review | `code-reviewer` | | Data analysis | `exploratory-data-analysis`, `statistical-analysis` | | Visualization | `data-artist`, `matplotlib`, `plotly` | | ML training | `senior-ml-engineer`, `pytorch-lightning` | | ML evaluation | `evaluating-machine-learning-models`, `shap` | | Scientific writing | `scientific-writing`, `scientific-schematics` | | Debugging | `systematic-debugging` | | Documentation | `docs-write`, `writing-docs` | | Architecture | `architecture-patterns` | | Bioinformatics | `biopython`, `bio-database-evidence` | | Drug discovery | `torchdrug`, `rdkit`, `uniprot-database` | **Feature with Skill Planning:** ```json { "id": "feat-001", "name": "Data Analysis Module", "recommended_skills": [ {"skill": "exploratory-data-analysis", "phase": "implementation"}, {"skill": "data-artist", "phase": "implementation"} ], "skill_dispatch_schedule": [ {"step": 1, "action": "Explore data", "invoke_skill": "exploratory-data-analysis", "router_handoff_required": true}, {"step": 2, "action": "Create charts", "invoke_skill": "data-artist", "router_handoff_required": true} ] } ``` **Setup**: Place `Claude_Skills_中文指南.md` in workspace root. Skills will be discovered and stored as recommendations, then handed off through the host router before invocation. ### MCP Auto-Integration & Human-like Computer Control **⚠️ Enables browser automation, desktop control, and seamless tool invocation** ```markdown ON SESSION START: 1. DISCOVER MCP servers - Run /mcp to list configured servers - Parse available tools from each server - Build capability map 2. CHECK critical capabilities: - browser_automation (puppeteer) - code_execution (ide) - desktop_control (desktop) - optional 3. 
AUTO-INSTALL missing servers if needed: - For web projects: puppeteer - For desktop apps: desktop - For database work: sqlite/postgres 4. UPDATE state.json → mcp_integration ``` **MCP Capability Matrix:** | Capability | MCP Server | What It Enables | |------------|------------|-----------------| | Browser automation | puppeteer | Navigate, click, type, screenshot | | Desktop control | desktop | Mouse, keyboard, screen capture | | Code execution | ide | Run Python, get diagnostics | | Database | sqlite/postgres | Query, insert, manage data | | Web search | brave-search | Research, documentation lookup | | HTTP requests | fetch | API testing, web fetching | **Auto-Tool Selection:** ``` Task Pattern → MCP Tool ───────────────────────────────────────────── "open website/url" → mcp__puppeteer_navigate "click button/element" → mcp__puppeteer_click "fill form/type text" → mcp__puppeteer_type "take screenshot" → mcp__puppeteer_screenshot "run JavaScript" → mcp__puppeteer_evaluate "control mouse" → mcp__desktop_mouse_move "press key/hotkey" → mcp__desktop_hotkey "execute Python" → mcp__ide__executeCode ``` **Example: Automated Web Testing** ```markdown ## E2E Test Flow (Automatic) 1. mcp__puppeteer_navigate → "https://myapp.com" 2. mcp__puppeteer_screenshot → capture initial state 3. mcp__puppeteer_fill → "#username", "testuser" 4. mcp__puppeteer_click → "#submit" 5. mcp__puppeteer_wait → ".dashboard" 6. mcp__puppeteer_evaluate → verify page state 7. mcp__puppeteer_screenshot → capture result ``` **Custom MCP Server Creation:** When no existing MCP server fits the task, autonomous-builder can: 1. Identify requirement 2. Design custom MCP server 3. Write server code to `.builder/mcp-servers/` 4. Register with `claude mcp add` 5. Use immediately ### Auto-Restart & Self-Supervision **⚠️ Enables true unattended long-running operation** ```markdown ON PROJECT INITIALIZATION: 1. Create .builder/ directory 2. 
Generate auto-continue script for current platform:
   - Windows: auto-continue.ps1
   - Linux/macOS: auto-continue.sh
3. Create supervisor.json with monitoring config
4. Script runs in background, monitors session health
```

**Auto-Generated Supervisor Script:**

```bash
#!/bin/bash
# .builder/auto-continue.sh - Auto-generated by autonomous-builder

PROJECT_DIR="/path/to/project"
BUILDER_DIR="$PROJECT_DIR/.builder"
STATE_FILE="$BUILDER_DIR/state.json"
SUPERVISOR_CONFIG="$BUILDER_DIR/supervisor.json"

# Self-supervision loop
while true; do
    # Check if project is complete
    if [ -f "$STATE_FILE" ]; then
        STATUS=$(grep -o '"status"[[:space:]]*:[[:space:]]*"[^"]*"' "$STATE_FILE" | head -1 | cut -d'"' -f4)
        if [ "$STATUS" = "completed" ]; then
            echo "[$(date)] Project completed. Exiting supervisor."
            exit 0
        fi
    fi

    # Check last activity (if no activity for 5 min, restart)
    # 2>/dev/null keeps grep quiet when state.json does not exist yet
    LAST_ACTIVITY=$(grep -o '"last_activity"[[:space:]]*:[[:space:]]*"[^"]*"' "$STATE_FILE" 2>/dev/null | head -1 | cut -d'"' -f4)
    if [ -n "$LAST_ACTIVITY" ]; then
        # Parse and check timeout...
        # If timeout exceeded, trigger new session
        :  # no-op placeholder: bash rejects an if-body that contains only comments
    fi

    # Start/resume Claude session with permission bypass for unattended operation
    # WARNING: --dangerously-skip-permissions bypasses all user confirmations
    # NOTE: flags are reproduced from this skill's template; check them against your installed CLI
    echo "[$(date)] Starting Claude session..."
    claude --skill autonomous-builder --project "$PROJECT_DIR" --dangerously-skip-permissions

    # Log session end
    echo "[$(date)] Session ended. Checking state..."

    # Wait before restart (configurable)
    sleep 5
done
```

**⚠️ Security Warning:** `--dangerously-skip-permissions` bypasses ALL user confirmations. Use it only in trusted, isolated environments, and make sure workspace isolation and the safety protocols are configured before running unattended.
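The `# Parse and check timeout...` placeholder in the supervisor script is left open. One possible way to fill it, sketched here in Python under stated assumptions: `last_activity` is an ISO-8601 timestamp as in the state file, the threshold is the 5 minutes mentioned in the script comment, and the helper names `parse_iso8601` and `is_stalled` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_iso8601(value: str) -> datetime:
    """Parse an ISO-8601 timestamp, accepting a trailing "Z" for UTC."""
    # datetime.fromisoformat only accepts "Z" from Python 3.11 onwards,
    # so normalise it to an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_stalled(state: dict, timeout_seconds: int = 300,
               now: Optional[datetime] = None) -> bool:
    """Return True when last_activity is older than the timeout."""
    last_activity = state.get("last_activity")
    if not last_activity:
        # No recorded activity yet: nothing to compare against.
        return False
    current = now or datetime.now(timezone.utc)
    return current - parse_iso8601(last_activity) > timedelta(seconds=timeout_seconds)
```

A supervisor could load `.builder/state.json` with `json.load`, call `is_stalled(state)`, and start a new session only when it returns `True`.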
**Supervisor Configuration:** ```json { "supervisor_version": "1.0", "project_path": "/path/to/project", "enabled": true, "monitoring": { "check_interval_seconds": 60, "session_timeout_seconds": 300, "max_restart_attempts": 10, "restart_cooldown_seconds": 5 }, "health_checks": { "progress_stall_threshold": 600, "error_rate_threshold": 0.5, "context_usage_warning": 0.8 }, "notifications": { "on_completion": true, "on_error_spike": true, "on_stall": true, "log_file": ".builder/supervisor.log" }, "statistics": { "total_sessions": 0, "total_restarts": 0, "total_runtime_seconds": 0, "last_restart_time": null } } ``` ### Core Workflow Phases | Phase | Actions | Output | |-------|---------|--------| | INITIALIZE | Check state, parse requirements | state.json, features.json | | DESIGN | Detect tech stack, choose architecture | architecture.md | | IMPLEMENT | Write code per feature | Source files | | TEST | Run unit/integration/E2E | Test results | | DEBUG | Apply 3-strike protocol | Fixes or escalation | | DEPLOY | Build, document, archive | Final deliverables | ### State File Schema ```json { "project_name": "string", "current_phase": "init|design|implement|test|deploy", "current_feature": "feature-id", "tech_stack": { "language": "string", "framework": "string", "runtime": "string" }, "completed_features": ["feat-001"], "pending_features": ["feat-002"], "session_count": 0, "last_activity": "ISO-8601-timestamp" } ``` ### 3-Strike Error Recovery ``` STRIKE 1: Direct Fix - Analyze error type and root cause - Apply known solution pattern - Run tests to verify STRIKE 2: Alternative Approach - Try different library/algorithm - Simplify implementation - Use different design pattern STRIKE 3: Architecture Rethink - Question design assumptions - Research alternatives - Consider partial implementation AFTER 3 STRIKES: Save checkpoint, request user guidance ``` ### Loop Prevention (Anti-Infinite-Loop) **⚠️ Critical: Prevents token waste in unattended operation** ``` DETECTION 
RULES: ┌─────────────────────────────────────────────────────────────────┐ │ Condition │ Threshold │ Action │ ├─────────────────────────────────────────────────────────────────┤ │ Same error repeated │ 3 times │ ESCALATE immediately│ │ Same file modified │ 5 times │ STOP, review approach│ │ Same command executed │ 3 times │ Try alternative │ │ No progress in N operations │ 10 ops │ PAUSE, reassess │ │ Single session too long │ 50 turns │ Checkpoint & pause │ └─────────────────────────────────────────────────────────────────┘ ``` **Loop Detection Algorithm:** ```python class LoopDetector: MAX_SAME_ERROR = 3 # Same error appears 3 times MAX_SAME_FILE_EDIT = 5 # Same file edited 5 times MAX_SAME_COMMAND = 3 # Same command run 3 times MAX_NO_PROGRESS = 10 # No feature completed in 10 ops MAX_SESSION_TURNS = 50 # Maximum turns per session def check_loop(self, state): # Check 1: Same error repeating if self.count_same_error(state.errors) >= self.MAX_SAME_ERROR: return LoopAlert("SAME_ERROR_LOOP", "Escalate to user") # Check 2: Same file being edited repeatedly if self.count_same_file_edits(state.recent_edits) >= self.MAX_SAME_FILE_EDIT: return LoopAlert("FILE_EDIT_LOOP", "Review approach") # Check 3: Same command executing repeatedly if self.count_same_commands(state.recent_commands) >= self.MAX_SAME_COMMAND: return LoopAlert("COMMAND_LOOP", "Try alternative") # Check 4: No progress indicator if self.count_operations_without_progress(state) >= self.MAX_NO_PROGRESS: return LoopAlert("NO_PROGRESS", "Reassess strategy") # Check 5: Session too long if state.session_turns >= self.MAX_SESSION_TURNS: return LoopAlert("SESSION_LIMIT", "Create checkpoint and pause") return None # No loop detected ``` **When Loop Detected - Escalation Protocol:** ```markdown ## LOOP ALERT: [Type] **Detected Pattern**: [What repeated] **Occurrences**: [Count] times **Time Spent**: [Duration] **Token Estimate**: [Approximate tokens used] **Actions Taken**: 1. Stopped current operation 2. 
Saved checkpoint to .builder/checkpoints/ 3. Logged loop pattern to .builder/loop-log.json **Status**: PAUSED - Awaiting user input **Options**: A) Skip this feature and continue with next B) Accept partial implementation C) Provide additional context/guidance D) Abort and generate report ``` **Loop State Tracking:** ```json { "loop_detection": { "error_history": [ {"error_hash": "abc123", "count": 2, "first_seen": "...", "last_seen": "..."} ], "file_edit_history": [ {"file": "src/app.py", "edit_count": 3, "last_edit": "..."} ], "command_history": [ {"command": "npm test", "run_count": 2, "last_run": "..."} ], "progress_check": { "operations_since_last_feature": 5, "last_completed_feature": "feat-002", "last_completion_time": "..." }, "session_metrics": { "start_time": "...", "turn_count": 25, "tokens_estimated": 50000 } } } ``` **Mandatory Break Points:** ``` After every 20 operations: └─ Check progress: Did any feature advance? ├─ YES: Continue └─ NO: Pause and reassess After every 10 minutes: └─ Review: Are we making meaningful progress? 
   ├─ YES: Continue
   └─ NO: Checkpoint and evaluate

On same error 2nd occurrence:
└─ Warning: Same error detected, trying different approach
└─ Log: Record pattern for analysis

On same error 3rd occurrence:
└─ STOP: Loop detected, escalate to user
└─ Save: Create checkpoint before pause
```

### File Writing Strategy

For files > 500 lines, write in segments:

```python
SEGMENT_SIZE = 200  # lines per segment

# write_file / edit_file are placeholders for the agent's file tools
# (for example Write and Edit), not Python built-ins.

# First segment: create file
write_file(path, first_segment)

# Subsequent segments: append
edit_file(path, append=next_segment)
```

### Technology Stack Detection

```python
from pathlib import Path

def detect_tech_stack(project_path):
    indicators = {
        'python': ['requirements.txt', 'pyproject.toml', '*.py'],
        'nodejs': ['package.json', '*.ts', '*.js'],
        'rust': ['Cargo.toml', '*.rs'],
        'go': ['go.mod', '*.go'],
    }
    # Score each stack by how many of its indicator patterns match a file
    # (one simple heuristic; other weightings are possible)
    root = Path(project_path)
    scores = {s: sum(any(root.rglob(p)) for p in pats) for s, pats in indicators.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] else None  # primary stack, or None if nothing matched
```

## Rules & Constraints

### MUST (Non-negotiable)

- Create `.builder/` directory before any work
- Update `state.json` after EVERY tool operation
- Log ALL errors to `errors.json` with resolution attempts
- Commit checkpoint after each feature completion
- Use segmented writes for files > 500 lines
- Run tests before marking feature complete

### SHOULD (Strong recommendations)

- Follow existing project conventions
- Use conventional commit messages
- Create meaningful tests (not just coverage)
- Document non-obvious decisions in `architecture.md`
- Prefer simpler solutions over clever ones

### NEVER (Explicit prohibitions)

- Delete user files without explicit permission
- Overwrite existing code without backup
- Commit secrets or credentials
- Skip error handling
- Make network calls without timeout
- Create infinite loops without escape conditions

### SAFETY CRITICAL (System Protection - HIGHEST PRIORITY)

**⚠️ These rules take precedence over ALL other operations.
When in doubt, STOP and ASK.** **Operations requiring explicit user confirmation:** | Operation Type | Examples | Required Action | |---------------|----------|-----------------| | Files outside workspace | `C:\Windows\`, `/etc/`, `/usr/bin/` | STOP, warn user, get explicit approval | | System configuration | Registry edits, `/etc/hosts`, environment variables | STOP, explain risk, get approval | | Destructive operations | `rm -rf`, `format`, `DROP DATABASE` | STOP, show impact, get approval | | Network/firewall changes | Port binding, firewall rules | STOP, explain scope, get approval | | Package installation | `npm install -g`, `pip install --system` | Warn about system-wide changes | **Pre-execution safety checks:** ```markdown Before ANY operation, verify: 1. IS TARGET INSIDE WORKSPACE? ✅ Path starts with project root -> Proceed ⚠️ Path outside workspace -> STOP and confirm 2. IS OPERATION DESTRUCTIVE? ✅ Read/Write/Create in workspace -> Proceed ⚠️ Delete/Format/Truncate -> STOP and confirm 3. IS OPERATION SYSTEM-WIDE? ✅ Project-local operation -> Proceed ⚠️ Global install/System config -> STOP and confirm 4. COULD DATA BE LOST? ✅ New file creation -> Proceed ⚠️ Overwrite/Delete existing -> STOP and backup first ``` **Protected paths (NEVER modify without explicit approval):** ``` System directories: - Windows: C:\Windows\, C:\Program Files\, C:\Program Files (x86)\ - Linux: /etc/, /usr/, /var/, /root/, /home/ (other users) - macOS: /System/, /Library/, /Applications/ User data outside workspace: - Desktop, Documents, Downloads (outside project) - Any path containing "backup", "archive", "important" - Database files not in project directory - Configuration files: .bashrc, .zshrc, .gitconfig (global) ``` **Safe operation protocol:** ``` IF operation touches files outside workspace: 1. STOP execution immediately 2. 
Display warning to user: "⚠️ SAFETY ALERT: This operation affects files outside the workspace" - Target path: [full path] - Operation type: [read/write/delete] - Potential impact: [description] 3. Ask for explicit confirmation: "Do you want to proceed? This action cannot be undone." 4. If user declines -> Abort and suggest alternatives 5. If user approves -> Log the approval and proceed cautiously IF operation could cause data loss: 1. Create backup before proceeding 2. Log the operation to .builder/safety-log.json 3. Provide rollback instructions ``` **Data safety principles:** 1. **Preserve user data** - Never delete/overwrite without explicit consent 2. **Backup before destructive ops** - Create .backup/ if needed 3. **Workspace isolation** - All operations confined to project directory 4. **Fail-safe defaults** - When uncertain, choose the safer option 5. **Audit trail** - Log all potentially dangerous operations ## MCP Integration ### Puppeteer (Web Testing) ```markdown ## E2E Test Pattern 1. Launch browser: mcp__puppeteer_navigate 2. Interact: mcp__puppeteer_click, mcp__puppeteer_type 3. Verify: mcp__puppeteer_evaluate, mcp__puppeteer_screenshot 4. Cleanup: mcp__puppeteer_close ``` ### IDE Tools (Code Execution) ```markdown ## Code Execution Pattern 1. Write code to file 2. Execute: mcp__ide__executeCode 3. Check diagnostics: mcp__ide__getDiagnostics 4. Fix errors and retry ``` ## Workflow Reporting ### Overview Autonomous-builder now generates comprehensive workflow reports that document the entire development process, including user prompts, decisions, errors, and solutions. 
**Features**: - Automatic workflow logging during feature implementation - Unified report template compatible with commit-with-reflection - Detailed recording of user prompts and AI decisions - Integration with knowledge-steward for experience extraction - Pure Chinese reports for better readability ### Configuration **Project-level configuration** (`.claude-workflows.yaml`): ```yaml version: "1.0" enabled: true reporting: language: "zh-CN" detail_level: "detailed" output_dir: "docs/workflows" skills: autonomous-builder: workflow_reporting: true ``` **Builder-level configuration** (`.builder/config.yaml`): ```yaml workflow_reporting: enabled: true use_unified_template: true language: "zh-CN" detail_level: "detailed" record_all_tools: true record_decisions: true ``` ### Workflow Log Structure During feature implementation, autonomous-builder maintains a detailed log in `.builder/workflow-log.json`: ```json { "session_id": "session-2026-02-15-001", "feature_id": "feat-003", "start_time": "2026-02-15T14:00:00Z", "end_time": "2026-02-15T14:45:00Z", "user_prompts": [ { "timestamp": "2026-02-15T14:00:00Z", "prompt": "实现用户认证功能", "context": "用户希望添加JWT token验证" } ], "workflow_steps": [ { "step": 1, "action": "分析需求", "tool": "Read", "files": ["server/auth.ts"], "duration_seconds": 120 } ], "decisions": [ { "point": "选择认证方案", "options": ["JWT", "Session", "OAuth"], "chosen": "JWT", "reason": "无状态,适合API" } ], "errors": [ { "type": "TypeError", "message": "Cannot read property 'userId'", "solution": "更新User接口定义", "attempts": 2 } ] } ``` ### Report Generation (Step 8) After completing feature implementation and testing, autonomous-builder generates a workflow report: 1. **Read workflow log**: Load `.builder/workflow-log.json` 2. **Load template**: Use unified template from `docs/workflows/templates/unified-template.md` 3. **Fill template**: Populate all 12 sections with session data 4. **Save report**: Write to `docs/workflows/YYYY-MM/DD_workflow_[category]_[desc].md` 5. 
**Update index**: Regenerate `docs/workflows/INDEX.md` ### Report Structure The generated report includes 12 sections: 1. **概述** - Summary of the work 2. **用户需求与提示词** - User requirements and key prompts 3. **工作流记录** - Detailed workflow steps, decisions, and tools used 4. **修改内容** - Files modified and main changes 5. **遇到的错误** - Errors encountered with details 6. **根本原因分析** - Root cause analysis 7. **调试过程** - Debugging steps and iterations 8. **经验总结** - Key insights and prevention strategies 9. **知识提炼** - Reusable patterns and anti-patterns 10. **测试与验证** - Test cases and verification steps 11. **参考资料** - Related documentation and resources 12. **指标** - Metrics (errors, iterations, success rate, etc.) ### Updated Commit Message Format Commits now reference the workflow report: ``` feat: 实现用户认证功能 添加了JWT token验证和用户登录API端点。 工作流步骤: 8 决策点: 3 遇到错误: 2 调试迭代: 4 详见工作流报告: docs/workflows/2026-02/15_workflow_feature_user-auth.md Co-Authored-By: Claude Sonnet 4.5 ``` ### Integration with knowledge-steward Workflow reports can be analyzed by knowledge-steward to: - Extract effective prompts and interaction patterns - Identify reusable architectural patterns - Build a knowledge base of common errors and solutions - Generate experience summaries and best practices See `references/workflow-recording.md` for detailed implementation guide. ## GitHub Integration ### Overview Autonomous-builder integrates with GitHub for remote repository management, issue tracking, and release automation. **Features**: - Automatic push after each feature completion - GitHub Issues tracking for features - Release tags at milestones (25%, 50%, 75%, 100%) - Version rollback support via GitHub history ### Prerequisites **GitHub CLI (gh)**: ```bash # Windows winget install GitHub.cli # macOS brew install gh # Linux sudo apt install gh ``` **Authentication**: ```bash gh auth login gh auth status # Verify ``` ### Workflow Integration **Initializer Agent (Session 1)**: 1. 
Prompt for GitHub repository URL (optional) 2. Verify `gh auth status` 3. Set up remote: `git remote add origin ` 4. Create README.md and PLANNING.md 5. Initial commit and push to GitHub 6. Create GitHub issues for all features **Builder Agent (Sessions 2+)**: 1. Implement feature 2. Commit with issue reference: `Closes #N` 3. Push to GitHub: `git push origin main` 4. Update GitHub issue (auto-closed via commit) 5. Check milestone and create release tag if needed ### Commit Message Format ``` feat: 实现用户认证功能 添加了JWT token验证和用户登录API端点。 工作流步骤: 8 决策点: 3 遇到错误: 2 调试迭代: 4 详见工作流报告: docs/workflows/2026-02/15_workflow_feature_user-auth.md Closes #123 Co-Authored-By: Claude Sonnet 4.5 ``` ### Release Tags Automatic tags created at milestones: - **25% completion**: v0.1.0 (Foundation) - **50% completion**: v0.2.0 (Core Features) - **75% completion**: v0.3.0 (Advanced Features) - **100% completion**: v1.0.0 (Release) ### Error Handling - **Network failures**: 3 retries with 5s delay, then queue for next session - **Auth failures**: Disable GitHub integration, continue with local commits - **Push conflicts**: Auto-pull with rebase and retry ### Disabling GitHub Leave repository URL empty during initialization, or set `state.json → github.enabled = false`. ### Rollback ```bash # Rollback to previous feature git log --oneline git reset --hard git push --force origin main gh issue reopen # Rollback to release tag git checkout v0.1.0 git checkout -b rollback-to-v0.1.0 ``` **See**: `references/github-integration.md` for comprehensive documentation. ## Examples ### Example 1: New Project Creation **Input**: "Build a REST API for task management with Python FastAPI" **Steps**: 1. Initialize `.builder/` with state.json 2. 
Analyze requirements -> Generate features.json:
   ```json
   {
     "features": [
       {"id": "feat-001", "name": "Project Setup", "status": "pending"},
       {"id": "feat-002", "name": "Database Models", "status": "pending"},
       {"id": "feat-003", "name": "CRUD Endpoints", "status": "pending"},
       {"id": "feat-004", "name": "Authentication", "status": "pending"},
       {"id": "feat-005", "name": "API Tests", "status": "pending"}
     ]
   }
   ```
3. Create architecture.md with FastAPI patterns
4. Implement feature by feature
5. Test each feature before moving to the next
6. Generate final documentation

### Example 2: Resume Interrupted Project

**Input**: User starts a new session and `.builder/state.json` exists

**Steps**:
1. Read state.json -> Get current phase and feature
2. Read features.json -> Get feature status
3. Resume from last checkpoint
4. Continue implementation

### Example 3: Bug Fix Request

**Input**: "Fix the authentication bug in my FastAPI app"

**Steps**:
1. Detect existing project structure
2. Read relevant code files
3. Identify bug using systematic-debugging patterns
4. Apply fix with 3-strike protocol
5. Run tests to verify fix
6. Update state and commit

## References

### Official Architecture Patterns (Anthropic claude-quickstarts)
- `references/two-agent-architecture.md`: **CRITICAL** - Two-Agent pattern for long-running tasks, fresh context per session
- `references/think-tool.md`: **CRITICAL** - Think Tool for complex reasoning before action
- `references/multi-layer-security.md`: **CRITICAL** - Defense-in-depth security architecture

### Core Capabilities
- `references/safety-protocols.md`: **CRITICAL** - System protection and safe operation protocols
- `references/loop-prevention.md`: **CRITICAL** - Anti-infinite-loop detection and token management
- `references/session-continuity.md`: **CRITICAL** - Auto-resume and continuous operation across sessions
- `references/skill-scheduling.md`: **CRITICAL** - Automatic skill discovery, planning, and dispatch
- `references/mcp-auto-integration.md`: **CRITICAL** - MCP auto-discovery, installation, and human-like computer control
- `references/github-integration.md`: **NEW** - GitHub integration for remote push, issue tracking, and release automation

### Implementation Guides
- `references/index.md`: Navigation for all reference docs
- `references/architecture-patterns.md`: Clean Architecture, Hexagonal, DDD
- `references/multi-language.md`: Language-specific patterns (Python, Node.js, Go, Rust)
- `references/error-recovery.md`: Detailed error handling strategies
- `references/mcp-integration.md`: MCP tool usage guide
- `references/testing-patterns.md`: Unit, integration, E2E testing

## Intelligent Plugin Discovery and Automatic Use (ToolSearch Auto-Discovery)

### Core Principle

While executing tasks, autonomous-builder **must proactively use ToolSearch** to dynamically discover and invoke the available MCP plugin tools. This upgrades the existing MCP Auto-Integration from static configuration to runtime discovery.

### Automatic Discovery at Session Start

```
ON SESSION START (Step 0 - runs before Step 1):

1. Probe all available plugins with ToolSearch:
   - ToolSearch("+playwright")     → browser automation tools
   - ToolSearch("+github")         → GitHub operation tools
   - ToolSearch("+serena")         → semantic code analysis tools
   - ToolSearch("context7")        → documentation lookup tools
   - ToolSearch("getDiagnostics")  → IDE diagnostics tools
   - ToolSearch("executeCode")     → code execution tools

2. Build a capability matrix and store it in .builder/state.json:
   {
     "discovered_plugins": {
       "playwright": true/false,
       "github_mcp": true/false,
       "serena": true/false,
       "context7": true/false,
       "ide_diagnostics": true/false,
       "ide_execute": true/false
     },
     "last_discovery": "ISO-8601-timestamp"
   }

3. Adjust the workflow strategy based on the discovered plugins
```

### Plugin Calls for Each Builder Step

| Builder Step | ToolSearch Query | Purpose |
|-------------|----------------|------|
| Step 1: Get Context | `ToolSearch("+serena get_symbols_overview")` | Semantic-level analysis of code structure; more precise than ls/grep |
| Step 2: Start Server | `ToolSearch("+playwright navigate")` | Verify the server with Playwright instead of Puppeteer |
| Step 3: Regression Check | `ToolSearch("getDiagnostics")` | IDE diagnostics for type errors and lint issues |
| Step 4: Select Feature | `ToolSearch("context7")` | Look up relevant library documentation to inform implementation decisions |
| Step 5: Implement | `ToolSearch("+serena find_symbol")` | Precisely locate the code symbols that need to change |
| Step 5: Implement | `ToolSearch("+serena replace_symbol_body")` | Semantic-level code editing |
| Step 6: Browser Test | `ToolSearch("+playwright snapshot")` | Capture a page snapshot for UI verification |
| Step 6: Browser Test | `ToolSearch("+playwright click")` | Simulate user interaction |
| Step 7: Update Status | `ToolSearch("+github update_issue")` | Update GitHub issue status |
| Step 8: Report | `ToolSearch("+github create_or_update_file")` | Push the report directly to GitHub |
| Step 9: Git Push | `ToolSearch("+github push_files")` | Push code via MCP |

### Plugin Selection During Implementation

```
DURING FEATURE IMPLEMENTATION:

1. Code analysis:
   IF serena is available:
     → ToolSearch("+serena find_symbol") to locate the target symbol
     → ToolSearch("+serena find_referencing_symbols") to analyze the impact scope
     → ToolSearch("+serena get_symbols_overview") to understand the file structure
   ELSE:
     → Fall back to Grep + Read

2. Code editing:
   IF serena is available:
     → ToolSearch("+serena replace_symbol_body") to replace a symbol precisely
     → ToolSearch("+serena insert_after_symbol") to insert new code
   ELSE:
     → Fall back to the Edit tool

3. Testing:
   IF playwright is available:
     → ToolSearch("+playwright navigate") to open the application
     → ToolSearch("+playwright snapshot") to capture the page state
     → ToolSearch("+playwright click") to simulate interaction
     → ToolSearch("+playwright browser_evaluate") to run JS checks
   ELSE IF puppeteer is available:
     → Use the puppeteer MCP tools
   ELSE:
     → Fall back to running test commands with Bash

4. Documentation lookup:
   IF context7 is available:
     → ToolSearch("context7") to query library documentation
     → Retrieve the latest API usage and best practices
   ELSE:
     → Use WebSearch/WebFetch

5. Code quality check:
   IF ide_diagnostics is available:
     → ToolSearch("getDiagnostics") to fetch diagnostics
     → Fix all errors and warnings before committing
   ELSE:
     → Run the linter/type-checker with Bash
```

### Relationship to the Existing MCP Auto-Integration

```
Old approach (static):
  ON SESSION START → run /mcp → parse tool list → hard-code tool names

New approach (dynamic ToolSearch):
  ON NEED → ToolSearch(keyword) → discover tool → use immediately

Advantages:
  - No need to know tool names in advance
  - Adapts automatically to each environment's plugin configuration
  - Loads on demand, reducing context usage
  - Keyword search is more flexible than exact names
```

### Notes

- Tools returned by ToolSearch are **immediately available**; no additional select step is needed
- Once a keyword search has loaded a tool, do **not** load it again with `select:`
- Prefer MCP tools over Bash commands
- If ToolSearch finds no relevant tool, fall back to the original approach
- Cache plugin discovery results in state.json to avoid repeated probing within a session
- Re-probe once at the start of each new session (the plugin configuration may have changed)

## Maintenance

- Sources: Anthropic agent patterns, claude-skills best practices
- Last updated: 2026-02-16
- Version: 2.0 (adds ToolSearch intelligent plugin discovery)
- Known limits: Cannot handle hardware-dependent code or GPU computing without prior setup

## Quality Gate

Before marking project complete:

1. [ ] All features in features.json have status "complete"
2. [ ] All tests pass (check features.json test counts)
3. [ ] No uncommitted changes
4. [ ] Documentation generated
5. [ ] State archived to `.builder/archive/`
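
Items 1 and 3 of the checklist above lend themselves to a mechanical check. The following is a minimal sketch, not part of the skill itself. It assumes features.json follows the shape shown in Example 1 (a `features` list whose entries carry `id` and `status`) and that the file lives at `.builder/features.json`, a path chosen here for illustration. The function names are hypothetical.

```python
import json
import subprocess
from pathlib import Path


def incomplete_features(data: dict) -> list[str]:
    """Return the ids of features whose status is not "complete"."""
    return [
        feature.get("id", "<unknown>")
        for feature in data.get("features", [])
        if feature.get("status") != "complete"
    ]


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Return True when `git status --porcelain` prints nothing."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == ""


def check_quality_gate(features_path: str = ".builder/features.json") -> list[str]:
    """Collect failures for gate items 1 and 3; an empty list means both pass.

    The default path is an assumption; adjust it to wherever the project
    actually stores features.json.
    """
    failures = []
    data = json.loads(Path(features_path).read_text(encoding="utf-8"))
    pending = incomplete_features(data)
    if pending:
        failures.append(f"features not complete: {', '.join(pending)}")
    if not working_tree_is_clean():
        failures.append("uncommitted changes present")
    return failures
```

Calling `check_quality_gate()` from the project root returns a list of failure messages. Items 2, 4, and 5 depend on project-specific test, documentation, and archive conventions, so they are left out of this sketch.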