---
name: goal-seeking-agent-pattern
version: 1.0.0
description: |
  Guides architects on when and how to use goal-seeking agents as a design pattern.
  This skill helps evaluate whether autonomous agents are appropriate for a given
  problem, how to structure their objectives, integrate with goal_agent_generator,
  and reference real amplihack examples like AKS SRE automation, CI diagnostics,
  pre-commit workflows, and fix-agent pattern matching.
auto-detection:
  triggers:
    - "complex workflow"
    - "autonomous agent"
    - "goal-seeking"
    - "adaptive behavior"
    - "multi-phase processing"
    - "task automation design"
    - "autonomous decision-making"
    - "multi-step process"
    - "workflow orchestration"
    - "self-directed agent"
allowed-tools: ["Read", "Grep", "Glob", "WebSearch"]
target-agents: ["architect"]
priority: "medium"
complexity: "medium"
---

# Goal-Seeking Agent Pattern Skill

## 1. What Are Goal-Seeking Agents?

Goal-seeking agents are autonomous AI agents that execute multi-phase objectives by:

1. **Understanding High-Level Goals**: Accept natural language objectives without explicit step-by-step instructions
2. **Planning Execution**: Break goals into phases with dependencies and success criteria
3. **Autonomous Execution**: Make decisions and adapt behavior based on intermediate results
4. **Self-Assessment**: Evaluate progress against success criteria and adjust approach
5. **Resilient Operation**: Handle failures gracefully and explore alternative solutions

### Core Characteristics

**Autonomy**: Agents decide HOW to achieve goals, not just follow prescriptive steps

**Adaptability**: Adjust strategy based on runtime conditions and intermediate results

**Goal-Oriented**: Focus on outcomes (what to achieve) rather than procedures (how to achieve)

**Multi-Phase**: Complex objectives decomposed into manageable phases with dependencies

**Self-Monitoring**: Track progress, detect failures, and course-correct autonomously

### Distinction from Traditional Agents

| Traditional Agent             | Goal-Seeking Agent            |
| ----------------------------- | ----------------------------- |
| Follows fixed workflow        | Adapts workflow to context    |
| Prescriptive steps            | Outcome-oriented objectives   |
| Human intervention on failure | Autonomous recovery attempts  |
| Single-phase execution        | Multi-phase with dependencies |
| Rigid decision tree           | Dynamic strategy adjustment   |

### When Goal-Seeking Makes Sense

Goal-seeking agents excel when:

- **Problem space is large**: Many possible paths to success
- **Context varies**: Runtime conditions affect optimal approach
- **Failures are expected**: Need autonomous recovery without human intervention
- **Objectives are clear**: Success criteria well-defined but path is flexible
- **Multi-step complexity**: Requires coordination across phases with dependencies

### When to Avoid Goal-Seeking

Use traditional agents or scripts when:

- **Single deterministic path**: Only one way to achieve goal
- **Latency-critical**: Need fastest possible execution (no decision overhead)
- **Safety-critical**: Human verification required at each step
- **Simple workflow**: Complexity of goal-seeking exceeds benefit
- **Audit requirements**: Need deterministic, reproducible execution

## 2. When to Use This Pattern

### Problem Indicators

Use goal-seeking agents when you observe these patterns:

#### Pattern 1: Workflow Variability

**Indicators**:

- Same objective requires different approaches based on context
- Manual decisions needed at multiple points
- "It depends" answers when mapping workflow

**Example**: Release workflow that varies by:

- Environment (staging vs production)
- Change type (hotfix vs feature)
- Current system state (healthy vs degraded)

**Solution**: Goal-seeking agent evaluates context and adapts workflow

#### Pattern 2: Multi-Phase Complexity

**Indicators**:

- Objective requires 3-5+ distinct phases
- Phases have dependencies (output of phase N feeds phase N+1)
- Parallel execution opportunities exist
- Success criteria differ per phase

**Example**: Data pipeline with phases:

1. Data collection (multiple sources, parallel)
2. Transformation (depends on collection results)
3. Validation (depends on transformation output)
4. Publishing (conditional on validation pass)

**Solution**: Goal-seeking agent orchestrates phases, handles dependencies

#### Pattern 3: Autonomous Recovery Needed

**Indicators**:

- Failures are expected and recoverable
- Multiple retry/fallback strategies exist
- Human intervention is expensive or slow
- Can verify success programmatically

**Example**: CI diagnostic workflow:

- Test failures (retry with different approach)
- Environment issues (reconfigure and retry)
- Dependency conflicts (resolve and rerun)

**Solution**: Goal-seeking agent tries strategies until success or escalation

#### Pattern 4: Adaptive Decision Making

**Indicators**:

- Need to evaluate trade-offs at runtime
- Multiple valid solutions with different characteristics
- Optimization objectives (speed vs quality vs cost)
- Context-dependent best practices

**Example**: Fix agent pattern matching:

- QUICK mode for obvious issues
- DIAGNOSTIC mode for unclear problems
- COMPREHENSIVE mode for complex solutions

**Solution**: Goal-seeking agent selects strategy based on problem analysis

#### Pattern 5: Domain Expertise Required

**Indicators**:

- Requires specialized knowledge to execute
- Multiple domain-specific tools/approaches
- Best practices vary by domain
- Coordination of specialized sub-agents

**Example**: AKS SRE automation:

- Azure-specific operations (ARM, CLI)
- Kubernetes expertise (kubectl, YAML)
- Networking knowledge (CNI, ingress)
- Security practices (RBAC, Key Vault)

**Solution**: Goal-seeking agent with domain expertise coordinates specialized actions

### Decision Framework

Use this 5-question framework to evaluate goal-seeking applicability:

#### Question 1: Is the objective well-defined but path flexible?

**YES if**:

- Clear success criteria exist
- Multiple valid approaches
- Runtime context affects optimal path

**NO if**:

- Only one correct approach
- Path is deterministic
- Success criteria ambiguous

**Example YES**: "Ensure AKS cluster is production-ready" (many paths, clear criteria)
**Example NO**: "Run specific kubectl command" (one path, prescriptive)

#### Question 2: Are there multiple phases with dependencies?

**YES if**:

- Objective naturally decomposes into 3-5+ phases
- Phase outputs feed subsequent phases
- Some phases can execute in parallel
- Failures in one phase affect downstream phases

**NO if**:

- Single-phase execution sufficient
- No inter-phase dependencies
- Purely sequential with no branching

**Example YES**: Data pipeline (collect → transform → validate → publish)
**Example NO**: Format code with ruff (single atomic operation)

#### Question 3: Is autonomous recovery valuable?

**YES if**:

- Failures are common and expected
- Multiple recovery strategies exist
- Human intervention is expensive/slow
- Can verify success automatically

**NO if**:

- Failures are rare edge cases
- Manual investigation always required
- Safety-critical (human verification needed)
- Cannot verify success programmatically

**Example YES**: CI diagnostic workflow (try multiple fix strategies)
**Example NO**: Deploy to production (human approval required)

#### Question 4: Does context significantly affect approach?

**YES if**:

- Environment differences change strategy
- Current system state affects decisions
- Trade-offs vary by situation (speed vs quality vs cost)
- Domain-specific best practices apply

**NO if**:

- Same approach works for all contexts
- No environmental dependencies
- No trade-off decisions needed

**Example YES**: Fix agent (quick vs diagnostic vs comprehensive based on issue)
**Example NO**: Generate UUID (context-independent)

#### Question 5: Is the complexity justified?

**YES if**:

- Problem is repeated frequently (2+ times/week)
- Manual execution takes 30+ minutes
- High value from automation
- Maintenance cost is acceptable

**NO if**:

- One-off or rare problem
- Quick manual execution (< 5 minutes)
- Simple script suffices
- Maintenance cost exceeds benefit

**Example YES**: CI failure diagnosis (frequent, time-consuming, high value)
**Example NO**: One-time data migration (rare, script sufficient)

### Decision Matrix

| All 5 YES | Use Goal-Seeking Agent |
| 4 YES, 1 NO | Probably use Goal-Seeking Agent |
| 3 YES, 2 NO | Consider simpler agent or hybrid |
| 2 YES, 3 NO | Traditional agent likely better |
| 0-1 YES | Script or simple automation |

## 3. Architecture Pattern

### Component Architecture

Goal-seeking agents have four core components:

```python
# Component 1: Goal Definition
class GoalDefinition:
    """Structured representation of objective"""
    raw_prompt: str              # Natural language goal
    goal: str                    # Extracted primary objective
    domain: str                  # Problem domain (security, data, automation, etc.)
    constraints: list[str]       # Technical/operational constraints
    success_criteria: list[str]  # How to verify success
    complexity: str              # simple, moderate, complex
    context: dict                # Additional metadata

# Component 2: Execution Plan
class ExecutionPlan:
    """Multi-phase plan with dependencies"""
    goal_id: uuid.UUID
    phases: list[PlanPhase]
    total_estimated_duration: str
    required_skills: list[str]
    parallel_opportunities: list[list[str]]  # Phases that can run parallel
    risk_factors: list[str]

# Component 3: Plan Phase
class PlanPhase:
    """Individual phase in execution plan"""
    name: str
    description: str
    required_capabilities: list[str]
    estimated_duration: str
    dependencies: list[str]          # Names of prerequisite phases
    parallel_safe: bool              # Can execute in parallel
    success_indicators: list[str]    # How to verify phase completion

# Component 4: Skill Definition
class SkillDefinition:
    """Capability needed for execution"""
    name: str
    description: str
    capabilities: list[str]
    implementation_type: str  # "native" or "delegated"
    delegation_target: str    # Agent to delegate to
```

### Execution Flow

```
┌─────────────────────────────────────────────────────────────┐
│ 1. GOAL ANALYSIS                                            │
│                                                             │
│  Input: Natural language objective                         │
│  Process: Extract goal, domain, constraints, criteria      │
│  Output: GoalDefinition                                    │
│                                                             │
│  [PromptAnalyzer.analyze_text(prompt)]                    │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ 2. PLANNING                                                 │
│                                                             │
│  Input: GoalDefinition                                     │
│  Process: Decompose into phases, identify dependencies     │
│  Output: ExecutionPlan                                     │
│                                                             │
│  [ObjectivePlanner.generate_plan(goal_definition)]        │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ 3. SKILL SYNTHESIS                                          │
│                                                             │
│  Input: ExecutionPlan                                      │
│  Process: Map capabilities to skills, identify agents      │
│  Output: list[SkillDefinition]                            │
│                                                             │
│  [SkillSynthesizer.synthesize(execution_plan)]            │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ 4. AGENT ASSEMBLY                                           │
│                                                             │
│  Input: GoalDefinition, ExecutionPlan, Skills              │
│  Process: Combine into executable bundle                   │
│  Output: GoalAgentBundle                                   │
│                                                             │
│  [AgentAssembler.assemble(goal, plan, skills)]            │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ 5. EXECUTION (Auto-Mode)                                    │
│                                                             │
│  Input: GoalAgentBundle                                    │
│  Process: Execute phases, monitor progress, adapt          │
│  Output: Success or escalation                             │
│                                                             │
│  [Auto-mode with initial_prompt from bundle]              │
└─────────────────────────────────────────────────────────────┘
```

### Phase Dependency Management

Phases can have three relationship types:

**Sequential Dependency**: Phase B depends on Phase A completion

```
Phase A → Phase B → Phase C
```

**Parallel Execution**: Phases can run concurrently

```
Phase A ──┬→ Phase B ──┐
          └→ Phase C ──┴→ Phase D
```

**Conditional Branching**: Phase selection based on results

```
Phase A → [Decision] → Phase B (success path)
                    └→ Phase C (recovery path)
```

### State Management

Goal-seeking agents maintain state across phases:

```python
class AgentState:
    """Runtime state for goal-seeking agent"""
    current_phase: str
    completed_phases: list[str]
    phase_results: dict[str, Any]  # Output from each phase
    failures: list[FailureRecord]  # Track what didn't work
    retry_count: int
    total_duration: timedelta
    context: dict                  # Shared context across phases
```

### Error Handling

Three error recovery strategies:

**Retry with Backoff**: Same approach, exponential delay

```python
for attempt in range(MAX_RETRIES):
    try:
        result = execute_phase(phase)
        break
    except RetryableError as e:
        wait_time = INITIAL_DELAY * (2 ** attempt)
        sleep(wait_time)
```

**Alternative Strategy**: Different approach to same goal

```python
for strategy in STRATEGIES:
    try:
        result = execute_phase(phase, strategy)
        break
    except StrategyFailedError:
        continue  # Try next strategy
else:
    escalate_to_human("All strategies exhausted")
```

**Graceful Degradation**: Accept partial success

```python
try:
    result = execute_phase_optimal(phase)
except OptimalFailedError:
    result = execute_phase_fallback(phase)  # Lower quality but works
```

## 4. Integration with goal_agent_generator

The `goal_agent_generator` module provides the implementation for goal-seeking agents. Here's how to integrate:

### Core API

```python
from amplihack.goal_agent_generator import (
    PromptAnalyzer,
    ObjectivePlanner,
    SkillSynthesizer,
    AgentAssembler,
    GoalAgentPackager,
)

# Step 1: Analyze natural language goal
analyzer = PromptAnalyzer()
goal_definition = analyzer.analyze_text("""
Automate AKS cluster production readiness verification.
Check security, networking, monitoring, and compliance.
Generate report with actionable recommendations.
""")

# Step 2: Generate execution plan
planner = ObjectivePlanner()
execution_plan = planner.generate_plan(goal_definition)

# Step 3: Synthesize required skills
synthesizer = SkillSynthesizer()
skills = synthesizer.synthesize(execution_plan)

# Step 4: Assemble complete agent
assembler = AgentAssembler()
agent_bundle = assembler.assemble(
    goal_definition=goal_definition,
    execution_plan=execution_plan,
    skills=skills,
    bundle_name="aks-readiness-checker"
)

# Step 5: Package for deployment
packager = GoalAgentPackager()
packager.package(
    bundle=agent_bundle,
    output_dir=".claude/agents/goal-driven/aks-readiness-checker"
)
```

### CLI Integration

```bash
# Generate agent from prompt file
amplihack goal-agent-generator create \
  --prompt ./prompts/aks-readiness.md \
  --output .claude/agents/goal-driven/aks-readiness-checker

# Generate agent from inline prompt
amplihack goal-agent-generator create \
  --inline "Automate CI failure diagnosis and fix iteration" \
  --output .claude/agents/goal-driven/ci-fixer

# List generated agents
amplihack goal-agent-generator list

# Test agent execution
amplihack goal-agent-generator test \
  --agent-path .claude/agents/goal-driven/ci-fixer \
  --dry-run
```

### PromptAnalyzer Details

Extracts structured information from natural language:

```python
from amplihack.goal_agent_generator import PromptAnalyzer
from pathlib import Path

analyzer = PromptAnalyzer()

# From file
goal_def = analyzer.analyze(Path("./prompts/my-goal.md"))

# From text
goal_def = analyzer.analyze_text("Deploy and monitor microservices to AKS")

# GoalDefinition contains:
print(goal_def.goal)               # "Deploy and monitor microservices to AKS"
print(goal_def.domain)             # "deployment"
print(goal_def.constraints)        # ["Zero downtime", "Rollback capability"]
print(goal_def.success_criteria)   # ["All pods running", "Metrics visible"]
print(goal_def.complexity)         # "moderate"
print(goal_def.context)            # {"priority": "high", "scale": "medium"}
```

Domain classification:

- `data-processing`: Data transformation, analysis, ETL
- `security-analysis`: Vulnerability scanning, audits
- `automation`: Workflow automation, scheduling
- `testing`: Test generation, validation
- `deployment`: Release, publishing, distribution
- `monitoring`: Observability, alerting
- `integration`: API connections, webhooks
- `reporting`: Dashboards, metrics, summaries

Complexity determination:

- `simple`: Single-phase, < 50 words, basic operations
- `moderate`: 2-4 phases, 50-150 words, some coordination
- `complex`: 5+ phases, > 150 words, sophisticated orchestration

### ObjectivePlanner Details

Generates multi-phase execution plans:

```python
from amplihack.goal_agent_generator import ObjectivePlanner

planner = ObjectivePlanner()
plan = planner.generate_plan(goal_definition)

# ExecutionPlan contains:
for i, phase in enumerate(plan.phases, 1):
    print(f"Phase {i}: {phase.name}")
    print(f"  Description: {phase.description}")
    print(f"  Duration: {phase.estimated_duration}")
    print(f"  Capabilities: {', '.join(phase.required_capabilities)}")
    print(f"  Dependencies: {', '.join(phase.dependencies)}")
    print(f"  Parallel Safe: {phase.parallel_safe}")
    print(f"  Success Indicators: {phase.success_indicators}")

print(f"\nTotal Duration: {plan.total_estimated_duration}")
print(f"Required Skills: {', '.join(plan.required_skills)}")
print(f"Parallel Opportunities: {plan.parallel_opportunities}")
print(f"Risk Factors: {plan.risk_factors}")
```

Phase templates by domain:

- **data-processing**: Collection → Transformation → Analysis → Reporting
- **security-analysis**: Reconnaissance → Vulnerability Detection → Risk Assessment → Reporting
- **automation**: Setup → Workflow Design → Execution → Validation
- **testing**: Test Planning → Implementation → Execution → Results Analysis
- **deployment**: Pre-deployment → Deployment → Verification → Post-deployment
- **monitoring**: Setup Monitors → Data Collection → Analysis → Alerting

### SkillSynthesizer Details

Maps capabilities to skills:

```python
from amplihack.goal_agent_generator import SkillSynthesizer

synthesizer = SkillSynthesizer()
skills = synthesizer.synthesize(execution_plan)

# list[SkillDefinition]
for skill in skills:
    print(f"Skill: {skill.name}")
    print(f"  Description: {skill.description}")
    print(f"  Capabilities: {', '.join(skill.capabilities)}")
    print(f"  Type: {skill.implementation_type}")
    if skill.implementation_type == "delegated":
        print(f"  Delegates to: {skill.delegation_target}")
```

Capability mapping:

- `data-*` → `data-processor` skill
- `security-*`, `vulnerability-*` → `security-analyzer` skill
- `test-*` → `tester` skill
- `deploy-*` → `deployer` skill
- `monitor-*`, `alert-*` → `monitor` skill
- `report-*`, `document-*` → `documenter` skill

### AgentAssembler Details

Combines components into executable bundle:

```python
from amplihack.goal_agent_generator import AgentAssembler

assembler = AgentAssembler()
bundle = assembler.assemble(
    goal_definition=goal_definition,
    execution_plan=execution_plan,
    skills=skills,
    bundle_name="custom-agent"  # Optional, auto-generated if omitted
)

# GoalAgentBundle contains:
print(bundle.id)                    # UUID
print(bundle.name)                  # "custom-agent" or auto-generated
print(bundle.version)               # "1.0.0"
print(bundle.status)                # "ready"
print(bundle.auto_mode_config)      # Configuration for auto-mode execution
print(bundle.metadata)              # Domain, complexity, skills, etc.

# Auto-mode configuration
config = bundle.auto_mode_config
print(config["max_turns"])          # Based on complexity
print(config["initial_prompt"])     # Generated execution prompt
print(config["success_criteria"])   # From goal definition
print(config["constraints"])        # From goal definition
```

Auto-mode configuration:

- `max_turns`: 5 (simple), 10 (moderate), 15 (complex), +20% per extra phase
- `initial_prompt`: Full markdown prompt with goal, plan, success criteria
- `working_dir`: Current directory
- `sdk`: "claude" (default)
- `ui_mode`: False (headless by default)

### GoalAgentPackager Details

Packages bundle for deployment:

```python
from amplihack.goal_agent_generator import GoalAgentPackager
from pathlib import Path

packager = GoalAgentPackager()
packager.package(
    bundle=agent_bundle,
    output_dir=Path(".claude/agents/goal-driven/my-agent")
)

# Creates:
# .claude/agents/goal-driven/my-agent/
# ├── agent.md           # Agent definition
# ├── prompt.md          # Initial prompt
# ├── metadata.json      # Bundle metadata
# ├── plan.yaml          # Execution plan
# └── skills.yaml        # Required skills
```

## 5. Recent Amplihack Examples

Real goal-seeking agents from the amplihack project:

### Example 1: AKS SRE Automation (Issue #1293)

**Problem**: Manual AKS cluster operations are time-consuming and error-prone

**Goal-Seeking Solution**:

```python
# Goal: Automate AKS production readiness verification
goal = """
Verify AKS cluster production readiness:
- Security: RBAC, network policies, Key Vault integration
- Networking: Ingress, DNS, load balancers
- Monitoring: Container Insights, alerts, dashboards
- Compliance: Azure Policy, resource quotas
Generate actionable report with recommendations.
"""

# Agent decomposes into phases:
# 1. Security Audit (parallel): RBAC check, network policies, Key Vault
# 2. Networking Validation (parallel): Ingress test, DNS resolution, LB health
# 3. Monitoring Verification (parallel): Metrics, logs, alerts configured
# 4. Compliance Check (depends on 1-3): Azure Policy, quotas, best practices
# 5. Report Generation (depends on 4): Markdown report with findings

# Agent adapts based on findings:
# - If security issues found: Suggest fixes, offer to apply
# - If monitoring missing: Generate alert templates
# - If compliance violations: List remediation steps
```

**Key Characteristics**:

- **Autonomous**: Checks multiple systems without step-by-step instructions
- **Adaptive**: Investigation depth varies by findings
- **Multi-Phase**: Parallel security/networking/monitoring, sequential reporting
- **Domain Expert**: Azure + Kubernetes knowledge embedded
- **Self-Assessing**: Validates each check, aggregates results

**Implementation**:

```python
# Located in: .claude/agents/amplihack/specialized/azure-kubernetes-expert.md
# Uses knowledge base: .claude/data/azure_aks_expert/

# Integrates with goal_agent_generator:
from amplihack.goal_agent_generator import (
    PromptAnalyzer, ObjectivePlanner, AgentAssembler
)

analyzer = PromptAnalyzer()
goal_def = analyzer.analyze_text(goal)

planner = ObjectivePlanner()
plan = planner.generate_plan(goal_def)  # Generates 5-phase plan

# Domain-specific customization:
plan.phases[0].required_capabilities = [
    "rbac-audit", "network-policy-check", "key-vault-integration"
]
```

**Lessons Learned**:

- Domain expertise critical for complex infrastructure
- Parallel execution significantly reduces total time
- Actionable recommendations increase agent value
- Comprehensive knowledge base (Q&A format) enables autonomous decisions

### Example 2: CI Diagnostic Workflow

**Problem**: CI failures require manual diagnosis and fix iteration

**Goal-Seeking Solution**:

```python
# Goal: Diagnose CI failure and iterate fixes until success
goal = """
CI pipeline failing after push.
Diagnose failures, apply fixes, push updates, monitor CI.
Iterate until all checks pass.
Stop at mergeable state without auto-merging.
"""

# Agent decomposes into phases:
# 1. CI Status Monitoring: Check current CI state
# 2. Failure Diagnosis: Analyze logs, compare environments
# 3. Fix Application: Apply fixes based on failure patterns
# 4. Push and Wait: Commit fixes, push, wait for CI re-run
# 5. Success Verification: Confirm all checks pass

# Iterative loop:
# Phases 2-4 repeat until success or max iterations (5)
```

**Key Characteristics**:

- **Iterative**: Repeats fix cycle until success
- **Autonomous Recovery**: Tries multiple fix strategies
- **State Management**: Tracks attempted fixes, avoids repeating failures
- **Pattern Matching**: Recognizes common CI failure types
- **Escalation**: Reports to user after max iterations

**Implementation**:

```python
# Located in: .claude/agents/amplihack/specialized/ci-diagnostic-workflow.md

# Fix iteration loop:
MAX_ITERATIONS = 5
iteration = 0

while iteration < MAX_ITERATIONS:
    status = check_ci_status()

    if status["conclusion"] == "success":
        break

    # Diagnose failures
    failures = analyze_ci_logs(status)

    # Apply pattern-matched fixes
    for failure in failures:
        if "test" in failure["type"]:
            fix_test_failure(failure)
        elif "lint" in failure["type"]:
            fix_lint_failure(failure)
        elif "type" in failure["type"]:
            fix_type_failure(failure)

    # Commit and push
    git_commit_and_push(f"fix: CI iteration {iteration + 1}")

    # Wait for CI re-run
    wait_for_ci_completion()

    iteration += 1

if iteration >= MAX_ITERATIONS:
    escalate_to_user("CI still failing after 5 iterations")
```

**Lessons Learned**:

- Iteration limits prevent infinite loops
- Pattern matching (test/lint/type) enables targeted fixes
- Smart waiting (exponential backoff) reduces wait time
- Never auto-merge: human approval always required

### Example 3: Pre-Commit Diagnostic Workflow

**Problem**: Pre-commit hooks fail with unclear errors

**Goal-Seeking Solution**:

```python
# Goal: Fix pre-commit hook failures before commit
goal = """
Pre-commit hooks failing.
Diagnose issues (formatting, linting, type checking).
Apply fixes locally, re-run hooks.
Ensure all hooks pass before allowing commit.
"""

# Agent decomposes into phases:
# 1. Hook Failure Analysis: Identify which hooks failed
# 2. Environment Check: Compare local vs pre-commit versions
# 3. Targeted Fixes: Apply fixes per hook type
# 4. Hook Re-run: Validate fixes, iterate if needed
# 5. Commit Readiness: Confirm all hooks pass
```

**Key Characteristics**:

- **Pre-Push Focus**: Fixes issues before pushing to CI
- **Tool Version Management**: Ensures local matches pre-commit config
- **Hook-Specific Fixes**: Tailored approach per hook type
- **Fast Iteration**: No wait for CI, immediate feedback

**Implementation**:

```python
# Located in: .claude/agents/amplihack/specialized/pre-commit-diagnostic.md

# Hook failure patterns:
HOOK_FIXES = {
    "ruff": lambda: subprocess.run(["ruff", "check", "--fix", "."]),
    "black": lambda: subprocess.run(["black", "."]),
    "mypy": lambda: add_type_ignores(),
    "trailing-whitespace": lambda: subprocess.run(["pre-commit", "run", "trailing-whitespace", "--all-files"]),
}

# Execution:
failed_hooks = detect_failed_hooks()

for hook in failed_hooks:
    if hook in HOOK_FIXES:
        HOOK_FIXES[hook]()
    else:
        generic_fix(hook)

# Re-run to verify
rerun_result = subprocess.run(["pre-commit", "run", "--all-files"])
if rerun_result.returncode == 0:
    print("All hooks passing, ready to commit!")
```

**Lessons Learned**:

- Pre-commit fixes are faster than CI iteration
- Tool version mismatches are common culprit
- Automated fixes for 80% of cases
- Remaining 20% escalate with clear diagnostics

### Example 4: Fix-Agent Pattern Matching

**Problem**: Different issues require different fix approaches

**Goal-Seeking Solution**:

```python
# Goal: Select optimal fix strategy based on problem context
goal = """
Analyze issue and select fix mode:
- QUICK: Obvious fixes (< 5 min)
- DIAGNOSTIC: Unclear root cause (investigation)
- COMPREHENSIVE: Complex issues (full workflow)
"""

# Agent decomposes into phases:
# 1. Issue Analysis: Classify problem type and complexity
# 2. Mode Selection: Choose QUICK/DIAGNOSTIC/COMPREHENSIVE
# 3. Fix Execution: Apply mode-appropriate strategy
# 4. Validation: Verify fix resolves issue
```

**Key Characteristics**:

- **Context-Aware**: Selects strategy based on problem analysis
- **Multi-Mode**: Three fix modes for different complexity levels
- **Pattern Recognition**: Learns from past fixes
- **Adaptive**: Escalates complexity if initial mode fails

**Implementation**:

```python
# Located in: .claude/agents/amplihack/specialized/fix-agent.md

# Mode selection logic:
def select_fix_mode(issue: Issue) -> FixMode:
    if issue.is_obvious() and issue.scope == "single-file":
        return FixMode.QUICK
    elif issue.root_cause_unclear():
        return FixMode.DIAGNOSTIC
    elif issue.is_complex() or issue.requires_architecture_change():
        return FixMode.COMPREHENSIVE
    else:
        return FixMode.DIAGNOSTIC  # Default to investigation

# Pattern frequency (from real usage):
FIX_PATTERNS = {
    "import": 0.15,      # Import errors (15%)
    "config": 0.12,      # Configuration issues (12%)
    "test": 0.18,        # Test failures (18%)
    "ci": 0.20,          # CI/CD problems (20%)
    "quality": 0.25,     # Code quality (linting, types) (25%)
    "logic": 0.10,       # Logic errors (10%)
}

# Template-based fixes for common patterns:
if issue.pattern == "import":
    apply_template("import-fix-template", issue)
elif issue.pattern == "config":
    apply_template("config-fix-template", issue)
# ... etc
```

**Lessons Learned**:

- Pattern matching enables template-based fixes (80% coverage)
- Mode selection reduces over-engineering (right-sized approach)
- Diagnostic mode critical for unclear issues (root cause analysis)
- Usage data informs template priorities

## 6. Design Checklist

Use this checklist when designing goal-seeking agents:

### Goal Definition

- [ ] Objective is clear and well-defined
- [ ] Success criteria are measurable and verifiable
- [ ] Constraints are explicit (time, resources, safety)
- [ ] Domain is identified (impacts phase templates)
- [ ] Complexity is estimated (simple/moderate/complex)

### Phase Design

- [ ] Decomposed into 3-5 phases (not too granular, not too coarse)
- [ ] Phase dependencies are explicit
- [ ] Parallel execution opportunities identified
- [ ] Each phase has clear success indicators
- [ ] Phase durations are estimated

### Skill Mapping

- [ ] Required capabilities identified per phase
- [ ] Skills mapped to existing agents or tools
- [ ] Delegation targets specified
- [ ] No missing capabilities

### Error Handling

- [ ] Retry strategies defined (max attempts, backoff)
- [ ] Alternative strategies identified
- [ ] Escalation criteria clear (when to ask for help)
- [ ] Graceful degradation options (fallback approaches)

### State Management

- [ ] State tracked across phases
- [ ] Phase results stored for downstream use
- [ ] Failure history maintained
- [ ] Context shared appropriately

### Testing

- [ ] Success scenarios tested
- [ ] Failure recovery tested
- [ ] Edge cases identified
- [ ] Performance validated (duration, resource usage)

### Documentation

- [ ] Goal clearly documented
- [ ] Phase descriptions complete
- [ ] Usage examples provided
- [ ] Integration points specified

### Philosophy Compliance

- [ ] Ruthless simplicity (no unnecessary complexity)
- [ ] Single responsibility per phase
- [ ] No over-engineering (right-sized solution)
- [ ] Regeneratable (clear specifications)

## 7. Agent SDK Integration (Future)

When the Agent SDK Skill is integrated, goal-seeking agents can leverage:

### Enhanced Autonomy

```python
# Agent SDK provides enhanced context management
from claude_agent_sdk import AgentContext, Tool

class GoalSeekingAgent:
    def __init__(self, context: AgentContext):
        self.context = context
        self.state = {}

    async def execute_phase(self, phase: PlanPhase):
        # SDK provides tools, memory, delegation
        tools = self.context.get_tools(phase.required_capabilities)
        memory = self.context.get_memory()

        # Execute with SDK support
        result = await phase.execute(tools, memory)

        # Store in context for downstream phases
        self.context.store_result(phase.name, result)
```

### Tool Discovery

```python
# SDK enables dynamic tool discovery
available_tools = context.discover_tools(capability="data-processing")

# Select optimal tool for task
tool = context.select_tool(
    capability="data-transformation",
    criteria={"performance": "high", "accuracy": "required"}
)
```

### Memory Management

```python
# SDK provides persistent memory across sessions
context.memory.store("deployment-history", deployment_record)
previous = context.memory.retrieve("deployment-history")

# Enables learning from past executions
if previous and previous.failed:
    # Avoid previous failure strategy
    strategy = select_alternative_strategy(previous.failure_reason)
```

### Agent Delegation

```python
# SDK simplifies agent-to-agent delegation
result = await context.delegate(
    agent="security-analyzer",
    task="audit-rbac-policies",
    input={"cluster": cluster_name}
)

# Parallel delegation
results = await context.delegate_parallel([
    ("security-analyzer", "audit-rbac-policies"),
    ("network-analyzer", "validate-ingress"),
    ("monitoring-validator", "check-metrics")
])
```

### Observability

```python
# SDK provides built-in tracing and metrics
with context.trace("data-transformation"):
    result = transform_data(input_data)

context.metrics.record("transformation-duration", duration)
context.metrics.record("transformation-accuracy", accuracy)
```

### Integration Example

```python
from claude_agent_sdk import AgentContext, create_agent
from amplihack.goal_agent_generator import GoalAgentBundle

# Create SDK-enabled goal-seeking agent
def create_goal_agent(bundle: GoalAgentBundle) -> Agent:
    context = AgentContext(
        name=bundle.name,
        version=bundle.version,
        capabilities=bundle.metadata["required_capabilities"]
    )

    # Register phases as agent tasks
    for phase in bundle.execution_plan.phases:
        context.register_task(
            name=phase.name,
            capabilities=phase.required_capabilities,
            executor=create_phase_executor(phase)
        )

    # Create agent with SDK
    agent = create_agent(context)

    # Execute goal
    return agent

# Usage:
agent = create_goal_agent(agent_bundle)
result = await agent.execute(bundle.auto_mode_config["initial_prompt"])
```

## 8. Trade-Off Analysis

### Goal-Seeking vs Traditional Agents

| Dimension            | Goal-Seeking Agent                    | Traditional Agent         |
| -------------------- | ------------------------------------- | ------------------------- |
| **Flexibility**      | High - adapts to context              | Low - fixed workflow      |
| **Development Time** | Moderate - define goals & phases      | Low - script steps        |
| **Execution Time**   | Higher - decision overhead            | Lower - direct execution  |
| **Maintenance**      | Lower - self-adapting                 | Higher - manual updates   |
| **Debuggability**    | Harder - dynamic behavior             | Easier - predictable flow |
| **Reusability**      | High - same agent, different contexts | Low - context-specific    |
| **Failure Handling** | Autonomous recovery                   | Manual intervention       |
| **Complexity**       | Higher - multi-phase coordination     | Lower - linear execution  |

### When to Choose Each

**Choose Goal-Seeking when**:

- Problem space is large with many valid approaches
- Context varies significantly across executions
- Autonomous recovery is valuable
- Reusability across contexts is important
- Development time investment is justified

**Choose Traditional when**:

- Single deterministic path exists
- Performance is critical (low latency required)
- Simplicity is paramount
- One-off or rare execution
- Debugging and auditability are critical

### Cost-Benefit Analysis

**Goal-Seeking Costs**:

- Higher development time (define goals, phases, capabilities)
- Increased execution time (decision overhead)
- More complex testing (dynamic behavior)
- Harder debugging (non-deterministic paths)

**Goal-Seeking Benefits**:

- Autonomous operation (less human intervention)
- Adaptive to context (works in varied conditions)
- Reusable across problems (same agent, different goals)
- Self-recovering (handles failures gracefully)

**Break-Even Point**: Goal-seeking justified when problem is:

- Repeated 2+ times per week, OR
- Takes 30+ minutes manual execution, OR
- Requires expert knowledge hard to document, OR
- High value from autonomous recovery

## 9. When to Escalate

Goal-seeking agents should escalate to humans when:

### Hard Limits Reached

**Max Iterations Exceeded**:

```python
if iteration_count >= MAX_ITERATIONS:
    escalate(
        reason="Reached maximum iterations without success",
        context={
            "iterations": iteration_count,
            "attempted_strategies": attempted_strategies,
            "last_error": last_error
        }
    )
```

**Timeout Exceeded**:

```python
if elapsed_time > MAX_DURATION:
    escalate(
        reason="Execution time exceeded limit",
        context={
            "elapsed": elapsed_time,
            "max_allowed": MAX_DURATION,
            "completed_phases": completed_phases
        }
    )
```

### Safety Boundaries

**Destructive Operations**:

```python
if operation.is_destructive() and not operation.has_approval():
    escalate(
        reason="Destructive operation requires human approval",
        operation=operation.description,
        impact=operation.estimate_impact()
    )
```

**Production Changes**:

```python
if target_environment == "production":
    escalate(
        reason="Production deployments require human verification",
        changes=proposed_changes,
        rollback_plan=rollback_strategy
    )
```

### Uncertainty Detection

**Low Confidence**:

```python
if decision_confidence < CONFIDENCE_THRESHOLD:
    escalate(
        reason="Confidence below threshold for autonomous decision",
        decision=decision_description,
        confidence=decision_confidence,
        alternatives=alternative_options
    )
```

**Conflicting Strategies**:

```python
if len(viable_strategies) > 1 and not clear_winner:
    escalate(
        reason="Multiple viable strategies, need human judgment",
        strategies=viable_strategies,
        trade_offs=strategy_trade_offs
    )
```

### Unexpected Conditions

**Unrecognized Errors**:

```python
if error_type not in KNOWN_ERROR_PATTERNS:
    escalate(
        reason="Encountered unknown error pattern",
        error=error_details,
        context=execution_context,
        recommendation="Manual investigation required"
    )
```

**Environment Mismatch**:

```python
if detected_environment != expected_environment:
    escalate(
        reason="Environment mismatch detected",
        expected=expected_environment,
        detected=detected_environment,
        risk="Potential for incorrect behavior"
    )
```

### Escalation Best Practices

**Provide Context**:

- What was attempted
- What failed and why
- What alternatives were considered
- Current system state

**Suggest Actions**:

- Recommend next steps
- Provide diagnostic commands
- Offer manual intervention points
- Suggest rollback if needed

**Enable Recovery**:

- Save execution state
- Document failures
- Provide resume capability
- Offer manual override

**Example Escalation**:

```python
escalate(
    reason="CI failure diagnosis unsuccessful after 5 iterations",
    context={
        "iterations": 5,
        "attempted_fixes": [
            "Import path corrections (iteration 1)",
            "Type annotation fixes (iteration 2)",
            "Test environment setup (iteration 3)",
            "Dependency version pins (iteration 4)",
            "Mock configuration (iteration 5)"
        ],
        "persistent_failures": [
            "test_integration.py::test_api_connection - Timeout",
            "test_models.py::test_validation - Assertion error"
        ],
        "system_state": "2 of 25 tests still failing",
        "ci_logs": "https://github.com/.../actions/runs/123456"
    },
    recommendations=[
        "Review test_api_connection timeout - may need increased timeout or mock",
        "Examine test_validation assertion - data structure may have changed",
        "Consider running tests locally with same environment as CI",
        "Check if recent changes affected integration test setup"
    ],
    next_steps={
        "manual_investigation": "Run failing tests locally with verbose output",
        "rollback_option": "git revert HEAD~5 if fixes made things worse",
        "resume_point": "Fix failures and run /amplihack:ci-diagnostic to resume"
    }
)
```

## 10. Example Workflow

Complete example: Building a goal-seeking agent for data pipeline automation

### Step 1: Define Goal

```markdown
# Goal: Automate Multi-Source Data Pipeline

## Objective

Collect data from multiple sources (S3, database, API), transform to common schema, validate quality, publish to data warehouse.

## Success Criteria

- All sources successfully ingested
- Data transformed to target schema
- Quality checks pass (completeness, accuracy)
- Data published to warehouse
- Pipeline completes within 30 minutes

## Constraints

- Must handle source unavailability gracefully
- No data loss (failed records logged)
- Idempotent (safe to re-run)
- Resource limits: 8GB RAM, 4 CPU cores

## Context

- Daily execution (automated schedule)
- Priority: High (blocking downstream analytics)
- Scale: Medium (100K-1M records per source)
```

### Step 2: Analyze with PromptAnalyzer

```python
from amplihack.goal_agent_generator import PromptAnalyzer

analyzer = PromptAnalyzer()
goal_definition = analyzer.analyze_text(goal_text)

# Result:
# goal_definition.goal = "Automate Multi-Source Data Pipeline"
# goal_definition.domain = "data-processing"
# goal_definition.complexity = "moderate"
# goal_definition.constraints = [
#     "Must handle source unavailability gracefully",
#     "No data loss (failed records logged)",
#     "Idempotent (safe to re-run)",
#     "Resource limits: 8GB RAM, 4 CPU cores"
# ]
# goal_definition.success_criteria = [
#     "All sources successfully ingested",
#     "Data transformed to target schema",
#     "Quality checks pass (completeness, accuracy)",
#     "Data published to warehouse",
#     "Pipeline completes within 30 minutes"
# ]
```

### Step 3: Generate Plan with ObjectivePlanner

```python
from amplihack.goal_agent_generator import ObjectivePlanner

planner = ObjectivePlanner()
execution_plan = planner.generate_plan(goal_definition)

# Result: 4-phase plan
# Phase 1: Data Collection (parallel)
#   - Collect from S3 (parallel-safe)
#   - Collect from database (parallel-safe)
#   - Collect from API (parallel-safe)
#   Duration: 15 minutes
#   Success: All sources attempted, failures logged
#
# Phase 2: Data Transformation (depends on Phase 1)
#   - Parse raw data
#   - Transform to common schema
#   - Handle missing fields
#   Duration: 15 minutes
#   Success: All records transformed or logged as failed
#
# Phase 3: Quality Validation (depends on Phase 2)
#   - Completeness check
#   - Accuracy validation
#   - Consistency verification
#   Duration: 5 minutes
#   Success: Quality thresholds met
#
# Phase 4: Data Publishing (depends on Phase 3)
#   - Load to warehouse
#   - Update metadata
#   - Generate report
#   Duration: 10 minutes
#   Success: Data in warehouse, report generated
```

### Step 4: Synthesize Skills

```python
from amplihack.goal_agent_generator import SkillSynthesizer

synthesizer = SkillSynthesizer()
skills = synthesizer.synthesize(execution_plan)

# Result: 3 skills
# Skill 1: data-collector
#   Capabilities: ["s3-read", "database-query", "api-fetch"]
#   Implementation: "native" (built-in)
#
# Skill 2: data-transformer
#   Capabilities: ["parsing", "schema-mapping", "validation"]
#   Implementation: "native" (built-in)
#
# Skill 3: data-publisher
#   Capabilities: ["warehouse-load", "metadata-update", "reporting"]
#   Implementation: "delegated" (delegates to warehouse tool)
```

### Step 5: Assemble Agent

```python
from amplihack.goal_agent_generator import AgentAssembler

assembler = AgentAssembler()
agent_bundle = assembler.assemble(
    goal_definition=goal_definition,
    execution_plan=execution_plan,
    skills=skills,
    bundle_name="multi-source-data-pipeline"
)

# Result: GoalAgentBundle
# - Name: multi-source-data-pipeline
# - Max turns: 12 (moderate complexity, 4 phases)
# - Initial prompt: Full execution plan with phases
# - Status: "ready"
```

### Step 6: Package Agent

```python
from amplihack.goal_agent_generator import GoalAgentPackager
from pathlib import Path

packager = GoalAgentPackager()
packager.package(
    bundle=agent_bundle,
    output_dir=Path(".claude/agents/goal-driven/multi-source-data-pipeline")
)

# Creates agent package:
# .claude/agents/goal-driven/multi-source-data-pipeline/
# ├── agent.md           # Agent definition
# ├── prompt.md          # Execution prompt
# ├── metadata.json      # Bundle metadata
# ├── plan.yaml          # Execution plan (4 phases)
# └── skills.yaml        # 3 required skills
```

### Step 7: Execute Agent (Auto-Mode)

```bash
# Execute via CLI
amplihack goal-agent-generator execute \
  --agent-path .claude/agents/goal-driven/multi-source-data-pipeline \
  --auto-mode \
  --max-turns 12

# Or programmatically:
```

```python
from claude_code import execute_auto_mode

result = execute_auto_mode(
    initial_prompt=agent_bundle.auto_mode_config["initial_prompt"],
    max_turns=agent_bundle.auto_mode_config["max_turns"],
    working_dir=agent_bundle.auto_mode_config["working_dir"]
)
```

### Step 8: Monitor Execution

Agent executes autonomously:

```
Phase 1: Data Collection [In Progress]
├── S3 Collection: ✓ COMPLETED (50K records, 5 minutes)
├── Database Collection: ✓ COMPLETED (75K records, 8 minutes)
└── API Collection: ✗ FAILED (timeout, retrying...)
    └── Retry 1: ✓ COMPLETED (25K records, 4 minutes)

Phase 1: ✓ COMPLETED (150K records total, 3 sources, 17 minutes)

Phase 2: Data Transformation [In Progress]
├── Parsing: ✓ COMPLETED (150K records parsed)
├── Schema Mapping: ✓ COMPLETED (148K records mapped, 2K failed)
└── Missing Fields: ✓ COMPLETED (defaults applied)

Phase 2: ✓ COMPLETED (148K records ready, 2K logged as failed, 12 minutes)

Phase 3: Quality Validation [In Progress]
├── Completeness: ✓ PASS (98.7% complete, threshold 95%)
├── Accuracy: ✓ PASS (99.2% accurate, threshold 98%)
└── Consistency: ✓ PASS (100% consistent)

Phase 3: ✓ COMPLETED (All checks passed, 4 minutes)

Phase 4: Data Publishing [In Progress]
├── Warehouse Load: ✓ COMPLETED (148K records loaded)
├── Metadata Update: ✓ COMPLETED (pipeline_run_id: 12345)
└── Report Generation: ✓ COMPLETED (report.html)

Phase 4: ✓ COMPLETED (Data published, 8 minutes)

Total Execution: ✓ SUCCESS (41 minutes, all success criteria met)
```

### Step 9: Review Results

```markdown
# Pipeline Execution Report

## Summary

- **Status**: SUCCESS
- **Duration**: 41 minutes (estimated: 30 minutes)
- **Records Processed**: 150K ingested, 148K published
- **Success Rate**: 98.7%

## Phase Results

### Phase 1: Data Collection

- S3: 50K records (5 min)
- Database: 75K records (8 min)
- API: 25K records (4 min, 1 retry)

### Phase 2: Data Transformation

- Successfully transformed: 148K records
- Failed transformations: 2K records (logged to failed_records.log)
- Failure reasons: Schema mismatch (1.5K), Invalid data (500)

### Phase 3: Quality Validation

- Completeness: 98.7% ✓
- Accuracy: 99.2% ✓
- Consistency: 100% ✓

### Phase 4: Data Publishing

- Warehouse load: Success
- Pipeline run ID: 12345
- Report: report.html

## Issues Encountered

1. API timeout (Phase 1): Resolved with retry
2. 2K transformation failures: Logged for manual review

## Recommendations

1. Investigate schema mismatches in API data
2. Add validation for API data format
3. Consider increasing timeout for API calls
```

### Step 10: Iteration (If Needed)

If pipeline fails, agent adapts:

```python
# Example: API source completely unavailable
if phase1_result["api"]["status"] == "unavailable":
    # Agent adapts: continues with partial data
    log_warning("API source unavailable, continuing with S3 + database")
    proceed_to_phase2_with_partial_data()

    # Report notes partial data
    add_to_report("Data incomplete: API source unavailable")

# Example: Quality validation fails
if phase3_result["completeness"] < THRESHOLD:
    # Agent tries recovery: fetch missing data
    missing_records = identify_missing_records()
    retry_collection_for_missing(missing_records)
    rerun_transformation()
    rerun_validation()

    # If still fails after retry, escalate
    if still_below_threshold:
        escalate("Quality threshold not met after retry")
```

## 11. Related Patterns

Goal-seeking agents relate to and integrate with other patterns:

### Debate Pattern (Multi-Agent Decision Making)

**When to Combine**:

- Goal-seeking agent faces complex decision with trade-offs
- Multiple valid approaches exist
- Need consensus from different perspectives

**Example**:

```python
# Goal-seeking agent reaches decision point
if len(viable_strategies) > 1:
    # Invoke debate pattern
    result = invoke_debate(
        question="Which data transformation approach?",
        perspectives=["performance", "accuracy", "simplicity"],
        context=current_state
    )

    # Use debate result to select strategy
    selected_strategy = result.consensus
```

### N-Version Pattern (Redundant Implementation)

**When to Combine**:

- Goal-seeking agent executing critical phase
- Error cost is high
- Multiple independent implementations possible

**Example**:

```python
# Critical security validation phase
if phase.is_critical():
    # Generate N versions
    results = generate_n_versions(
        phase=phase,
        n=3,
        independent=True
    )

    # Use voting or comparison to select result
    validated_result = compare_and_validate(results)
```

### Cascade Pattern (Fallback Strategies)

**When to Combine**:

- Goal-seeking agent has preferred approach but needs fallbacks
- Quality/performance trade-offs exist
- Graceful degradation desired

**Example**:

```python
# Data transformation with fallback
try:
    # Optimal: ML-based transformation
    result = ml_transform(data)
except MLModelUnavailable:
    try:
        # Pragmatic: Rule-based transformation
        result = rule_based_transform(data)
    except RuleEngineError:
        # Minimal: Manual templates
        result = template_transform(data)
```

### Investigation Workflow (Knowledge Discovery)

**When to Combine**:

- Goal requires understanding existing system
- Need to discover architecture or patterns
- Knowledge excavation before execution

**Example**:

```python
# Before automating deployment, understand current system
if goal.requires_system_knowledge():
    # Run investigation workflow
    investigation = run_investigation_workflow(
        scope="deployment pipeline",
        depth="comprehensive"
    )

    # Use findings to inform goal-seeking execution
    adapt_plan_based_on_investigation(investigation.findings)
```

### Document-Driven Development (Specification First)

**When to Combine**:

- Goal-seeking agent generates or modifies code
- Clear specifications prevent drift
- Documentation is single source of truth

**Example**:

```python
# Goal: Implement new feature
if goal.involves_code_changes():
    # DDD Phase 1: Generate specifications
    specs = generate_specifications(goal)

    # DDD Phase 2: Review and approve specs
    await human_review(specs)

    # Goal-seeking agent implements from specs
    implementation = execute_from_specifications(specs)
```

### Pre-Commit / CI Diagnostic (Quality Gates)

**When to Combine**:

- Goal-seeking agent makes code changes
- Need to ensure quality before commit/push
- Automated validation and fixes

**Example**:

```python
# After goal-seeking agent generates code
if changes_made:
    # Run pre-commit diagnostic
    pre_commit_result = run_pre_commit_diagnostic()

    if pre_commit_result.has_failures():
        # Agent fixes issues
        apply_pre_commit_fixes(pre_commit_result.failures)

    # After push, run CI diagnostic
    ci_result = run_ci_diagnostic_workflow()

    if ci_result.has_failures():
        # Agent iterates fixes
        iterate_ci_fixes_until_pass(ci_result)
```

## 12. Quality Standards

Goal-seeking agents must meet these quality standards:

### Correctness

**Success Criteria Verification**:

- [ ] Agent verifies all success criteria before completion
- [ ] Intermediate phase results validated
- [ ] No silent failures (all errors logged and handled)

**Testing Coverage**:

- [ ] Happy path tested (all success criteria met)
- [ ] Failure scenarios tested (phase failures, retries)
- [ ] Edge cases identified and tested
- [ ] Integration with real systems validated

### Resilience

**Error Handling**:

- [ ] Retry logic with exponential backoff
- [ ] Alternative strategies for common failures
- [ ] Graceful degradation when optimal path unavailable
- [ ] Clear escalation criteria

**State Management**:

- [ ] State persisted across phase boundaries
- [ ] Resume capability after failures
- [ ] Idempotent execution (safe to re-run)
- [ ] Cleanup on abort

### Performance

**Efficiency**:

- [ ] Phases execute in parallel when possible
- [ ] No unnecessary work (skip completed phases on retry)
- [ ] Resource usage within limits (memory, CPU, time)
- [ ] Timeout limits enforced

**Latency**:

- [ ] Decision overhead acceptable for use case
- [ ] No blocking waits (async where possible)
- [ ] Progress reported (no black box periods)

### Observability

**Logging**:

- [ ] Phase transitions logged
- [ ] Decisions logged with reasoning
- [ ] Errors logged with context
- [ ] Results logged with metrics

**Metrics**:

- [ ] Duration per phase tracked
- [ ] Success/failure rates tracked
- [ ] Resource usage monitored
- [ ] Quality metrics reported

**Tracing**:

- [ ] Execution flow traceable
- [ ] Correlations across phases maintained
- [ ] Debugging information sufficient

### Usability

**Documentation**:

- [ ] Goal clearly stated
- [ ] Success criteria documented
- [ ] Usage examples provided
- [ ] Integration guide complete

**User Experience**:

- [ ] Clear progress reporting
- [ ] Actionable error messages
- [ ] Human-readable outputs
- [ ] Easy to invoke and monitor

### Philosophy Compliance

**Ruthless Simplicity**:

- [ ] No unnecessary phases or complexity
- [ ] Simplest approach that works
- [ ] No premature optimization

**Single Responsibility**:

- [ ] Each phase has one clear job
- [ ] No overlapping responsibilities
- [ ] Clean phase boundaries

**Modularity**:

- [ ] Skills are reusable across agents
- [ ] Phases are independent
- [ ] Clear interfaces (inputs/outputs)

**Regeneratable**:

- [ ] Can be rebuilt from specifications
- [ ] No hardcoded magic values
- [ ] Configuration externalized

## 13. Getting Started

### Quick Start: Build Your First Goal-Seeking Agent

**Step 1**: Install amplihack (if not already)

```bash
pip install amplihack
```

**Step 2**: Write a goal prompt

```bash
cat > my-goal.md << 'EOF'
# Goal: Automated Security Audit

Check application for common security issues:
- SQL injection vulnerabilities
- XSS vulnerabilities
- Insecure dependencies
- Missing security headers

Generate report with severity levels and remediation steps.
EOF
```

**Step 3**: Generate agent

```bash
amplihack goal-agent-generator create \
  --prompt my-goal.md \
  --output .claude/agents/goal-driven/security-auditor
```

**Step 4**: Review generated plan

```bash
cat .claude/agents/goal-driven/security-auditor/plan.yaml
```

**Step 5**: Execute agent

```bash
amplihack goal-agent-generator execute \
  --agent-path .claude/agents/goal-driven/security-auditor \
  --auto-mode
```

### Common Use Cases

**Use Case 1: Workflow Automation**

```bash
# Create release automation agent
echo "Automate release workflow: tag, build, test, deploy to staging" | \
  amplihack goal-agent-generator create --inline --output .claude/agents/goal-driven/release-automator
```

**Use Case 2: Data Pipeline**

```bash
# Create ETL pipeline agent
echo "Extract from sources, transform to schema, validate quality, load to warehouse" | \
  amplihack goal-agent-generator create --inline --output .claude/agents/goal-driven/etl-pipeline
```

**Use Case 3: Diagnostic Workflow**

```bash
# Create performance diagnostic agent
echo "Diagnose application performance issues, identify bottlenecks, suggest optimizations" | \
  amplihack goal-agent-generator create --inline --output .claude/agents/goal-driven/perf-diagnostic
```

### Learning Resources

**Documentation**:

- Review examples in `~/.amplihack/.claude/skills/goal-seeking-agent-pattern/examples/`
- Read real agent implementations in `~/.amplihack/.claude/agents/amplihack/specialized/`
- Check integration guide in `~/.amplihack/.claude/skills/goal-seeking-agent-pattern/templates/integration_guide.md`

**Practice**:

1. Start simple: Build single-phase agent (e.g., file formatter)
2. Add complexity: Build multi-phase agent (e.g., test generator + runner)
3. Add autonomy: Build agent with error recovery (e.g., CI fixer)
4. Build production: Build full goal-seeking agent (e.g., deployment pipeline)

**Get Help**:

- Review decision framework (Section 2)
- Check design checklist (Section 6)
- Study real examples (Section 5)
- Ask architect agent for guidance

### Next Steps

After building your first goal-seeking agent:

1. **Test thoroughly**: Cover success, failure, and edge cases
2. **Monitor in production**: Track metrics, logs, failures
3. **Iterate**: Refine based on real usage
4. **Document learnings**: Update DISCOVERIES.md with insights
5. **Share patterns**: Add successful approaches to PATTERNS.md

**Success Indicators**:

- Agent completes goal autonomously 80%+ of time
- Failures escalate with clear context
- Execution time is acceptable
- Users trust agent to run autonomously

---

**Remember**: Goal-seeking agents should be ruthlessly simple, focused on clear objectives, and adaptive to context. Start simple, add complexity only when justified, and always verify against success criteria.