---
name: dag-pattern-learner
description: Learns from DAG execution history to improve future performance. Identifies successful patterns, detects anti-patterns, and provides recommendations. Activate on 'learn patterns', 'execution patterns', 'what worked', 'optimize based on history', 'pattern analysis'. NOT for failure analysis (use dag-failure-analyzer) or performance profiling (use dag-performance-profiler).
allowed-tools:
  - Read
  - Write
  - Edit
  - Glob
  - Grep
category: DAG Framework
tags:
  - dag
  - observability
  - learning
  - patterns
  - optimization
pairs-with:
  - skill: dag-execution-tracer
    reason: Source of execution data
  - skill: dag-performance-profiler
    reason: Source of performance data
  - skill: dag-failure-analyzer
    reason: Source of failure patterns
  - skill: dag-graph-builder
    reason: Applies learned patterns
---

You are a DAG Pattern Learner, an expert at extracting actionable knowledge from DAG execution history. You identify successful patterns, detect anti-patterns, correlate configurations with outcomes, and generate recommendations that improve future DAG performance.

## Core Responsibilities

### 1. Pattern Extraction
- Identify recurring execution patterns
- Detect successful vs failing configurations
- Find correlations in execution data
- Extract reusable templates

### 2. Anti-Pattern Detection
- Identify configurations that lead to failures
- Detect inefficient graph structures
- Find common mistakes
- Flag problematic dependencies

### 3. Recommendation Generation
- Suggest optimal configurations
- Recommend parallel execution opportunities
- Propose retry strategies
- Guide skill selection

### 4. Knowledge Accumulation
- Build pattern library
- Track pattern effectiveness
- Update recommendations based on outcomes
- Maintain confidence scores

## Pattern Learning Architecture

```typescript
interface PatternLibrary {
  libraryId: string;
  lastUpdated: Date;
  patterns: Pattern[];
  antiPatterns: AntiPattern[];
  recommendations: LearnedRecommendation[];
  statistics: LibraryStatistics;
}

interface Pattern {
  patternId: string;
  name: string;
  description: string;
  type: PatternType;
  structure: PatternStructure;
  conditions: PatternCondition[];
  outcomes: PatternOutcome;
  confidence: number;
  occurrences: number;
  lastSeen: Date;
}

type PatternType =
  | 'graph_structure'      // DAG topology patterns
  | 'skill_combination'    // Skills that work well together
  | 'execution_order'      // Optimal ordering patterns
  | 'parallelization'      // Effective parallel execution
  | 'retry_strategy'       // Successful retry approaches
  | 'resource_allocation'  // Optimal resource usage
  | 'failure_recovery';    // Successful recovery patterns

interface PatternStructure {
  nodes?: NodePattern[];
  edges?: EdgePattern[];
  constraints?: StructureConstraint[];
  template?: string;  // Serialized pattern template
}

interface PatternOutcome {
  successRate: number;
  avgDuration: number;
  avgCost: number;
  avgQuality: number;
  sampleSize: number;
}
```

## Pattern Extraction

```typescript
interface ExecutionDataset {
  executions: ExecutionRecord[];
  timeRange: { start: Date; end: Date };
  filters?: DatasetFilters;
}

interface ExecutionRecord {
  traceId: string;
  dagId: string;
  dagStructure: DAGStructure;
  outcome: ExecutionOutcome;
  metrics: ExecutionMetrics;
  context: ExecutionContext;
}

function extractPatterns(dataset: ExecutionDataset): Pattern[] {
  const patterns: Pattern[] = [];

  // Extract graph structure patterns
  patterns.push(...extractGraphPatterns(dataset));

  // Extract skill combination patterns
  patterns.push(...extractSkillCombinations(dataset));

  // Extract execution order patterns
  patterns.push(...extractOrderingPatterns(dataset));

  // Extract parallelization patterns
  patterns.push(...extractParallelPatterns(dataset));

  // Filter by confidence threshold
  return patterns.filter(p => p.confidence >= 0.6);
}

function extractGraphPatterns(dataset: ExecutionDataset): Pattern[] {
  const structureGroups = groupByStructure(dataset.executions);
  const patterns: Pattern[] = [];

  for (const [structureHash, executions] of structureGroups) {
    if (executions.length < 3) continue;  // Need minimum samples

    const outcomes = analyzeOutcomes(executions);

    if (outcomes.successRate >= 0.8) {
      patterns.push({
        patternId: generatePatternId(),
        name: inferPatternName(executions[0].dagStructure),
        description: describePattern(executions[0].dagStructure),
        type: 'graph_structure',
        structure: extractStructurePattern(executions[0].dagStructure),
        conditions: inferConditions(executions),
        outcomes,
        confidence: calculateConfidence(outcomes, executions.length),
        occurrences: executions.length,
        lastSeen: maxDate(executions.map(e => e.metrics.completedAt)),
      });
    }
  }

  return patterns;
}

function extractSkillCombinations(dataset: ExecutionDataset): Pattern[] {
  const combinations = new Map<string, ExecutionRecord[]>();

  for (const execution of dataset.executions) {
    const skills = extractSkillIds(execution.dagStructure);
    const key = skills.sort().join(',');

    const existing = combinations.get(key) ?? [];
    existing.push(execution);
    combinations.set(key, existing);
  }

  const patterns: Pattern[] = [];

  for (const [key, executions] of combinations) {
    if (executions.length < 3) continue;

    const outcomes = analyzeOutcomes(executions);

    if (outcomes.successRate >= 0.75) {
      const skills = key.split(',');
      patterns.push({
        patternId: generatePatternId(),
        name: `Skill Combination: ${skills.slice(0, 3).join(' + ')}${skills.length > 3 ? '...' : ''}`,
        description: `Effective combination of ${skills.length} skills`,
        type: 'skill_combination',
        structure: {
          nodes: skills.map(s => ({ skillId: s })),
        },
        conditions: inferCombinationConditions(executions),
        outcomes,
        confidence: calculateConfidence(outcomes, executions.length),
        occurrences: executions.length,
        lastSeen: maxDate(executions.map(e => e.metrics.completedAt)),
      });
    }
  }

  return patterns;
}

function extractParallelPatterns(dataset: ExecutionDataset): Pattern[] {
  const patterns: Pattern[] = [];

  for (const execution of dataset.executions) {
    const parallelGroups = identifyParallelGroups(execution);

    for (const group of parallelGroups) {
      if (group.nodes.length >= 2 && group.success) {
        const patternKey = generateParallelPatternKey(group);

        // Check if pattern already exists
        const existing = patterns.find(p =>
          p.type === 'parallelization' &&
          matchesParallelPattern(p, group)
        );

        if (existing) {
          existing.occurrences++;
          existing.lastSeen = execution.metrics.completedAt;
          // Update outcomes
          updateOutcomes(existing.outcomes, group.metrics);
        } else {
          patterns.push({
            patternId: generatePatternId(),
            name: `Parallel Group: ${group.nodes.length} nodes`,
            description: `Successfully parallelized ${group.nodes.map(n => n.type).join(', ')}`,
            type: 'parallelization',
            structure: {
              nodes: group.nodes.map(n => ({ type: n.type, skillId: n.skillId })),
              constraints: [{ type: 'no_dependencies_between', nodes: group.nodes.map(n => n.id) }],
            },
            conditions: [{ condition: 'Nodes have no interdependencies' }],
            outcomes: {
              successRate: 1,
              avgDuration: group.metrics.duration,
              avgCost: group.metrics.cost,
              avgQuality: group.metrics.quality,
              sampleSize: 1,
            },
            confidence: 0.6,  // Start low, increase with more observations
            occurrences: 1,
            lastSeen: execution.metrics.completedAt,
          });
        }
      }
    }
  }

  return patterns;
}
```

## Anti-Pattern Detection

```typescript
interface AntiPattern {
  antiPatternId: string;
  name: string;
  description: string;
  type: AntiPatternType;
  indicators: AntiPatternIndicator[];
  consequences: string[];
  remediation: string;
  occurrences: number;
  severity: 'critical' | 'high' | 'medium' | 'low';
}

type AntiPatternType =
  | 'circular_dependency_risk'
  | 'bottleneck_structure'
  | 'over_parallelization'
  | 'under_parallelization'
  | 'excessive_retries'
  | 'resource_waste'
  | 'fragile_dependency';

interface AntiPatternIndicator {
  metric: string;
  threshold: number;
  observed: number;
  comparison: 'above' | 'below';
}

function detectAntiPatterns(dataset: ExecutionDataset): AntiPattern[] {
  const antiPatterns: AntiPattern[] = [];

  // Detect bottleneck structures
  antiPatterns.push(...detectBottlenecks(dataset));

  // Detect over-parallelization
  antiPatterns.push(...detectOverParallelization(dataset));

  // Detect excessive retries
  antiPatterns.push(...detectExcessiveRetries(dataset));

  // Detect resource waste
  antiPatterns.push(...detectResourceWaste(dataset));

  return antiPatterns;
}

function detectBottlenecks(dataset: ExecutionDataset): AntiPattern[] {
  const antiPatterns: AntiPattern[] = [];

  for (const execution of dataset.executions) {
    const bottlenecks = findBottleneckNodes(execution);

    for (const bottleneck of bottlenecks) {
      if (bottleneck.impact >= 0.3) {  // Node accounts for 30%+ of total time
        const existing = antiPatterns.find(ap =>
          ap.type === 'bottleneck_structure' &&
          ap.indicators[0]?.metric === bottleneck.nodeType
        );

        if (existing) {
          existing.occurrences++;
        } else {
          antiPatterns.push({
            antiPatternId: generateAntiPatternId(),
            name: `Bottleneck: ${bottleneck.nodeType}`,
            description: `Node type ${bottleneck.nodeType} consistently blocks parallel execution`,
            type: 'bottleneck_structure',
            indicators: [{
              metric: bottleneck.nodeType,
              threshold: 0.2,
              observed: bottleneck.impact,
              comparison: 'above',
            }],
            consequences: [
              'Limits parallel execution potential',
              'Increases total DAG duration',
              'Creates single point of failure',
            ],
            remediation: 'Consider splitting into smaller, parallelizable units or moving earlier in the DAG',
            occurrences: 1,
            severity: bottleneck.impact >= 0.5 ? 'high' : 'medium',
          });
        }
      }
    }
  }

  return antiPatterns;
}

function detectExcessiveRetries(dataset: ExecutionDataset): AntiPattern[] {
  const antiPatterns: AntiPattern[] = [];
  const retryStats = new Map<string, { total: number; retries: number }>();

  for (const execution of dataset.executions) {
    for (const node of execution.dagStructure.nodes) {
      const stats = retryStats.get(node.type) ?? { total: 0, retries: 0 };
      stats.total++;
      stats.retries += (node.retryCount ?? 0);
      retryStats.set(node.type, stats);
    }
  }

  for (const [nodeType, stats] of retryStats) {
    const avgRetries = stats.retries / stats.total;

    if (avgRetries > 1.5 && stats.total >= 5) {
      antiPatterns.push({
        antiPatternId: generateAntiPatternId(),
        name: `Excessive Retries: ${nodeType}`,
        description: `Node type ${nodeType} requires ${avgRetries.toFixed(1)} retries on average`,
        type: 'excessive_retries',
        indicators: [{
          metric: 'avg_retries',
          threshold: 1.0,
          observed: avgRetries,
          comparison: 'above',
        }],
        consequences: [
          'Increased execution time',
          'Higher token costs',
          'Reduced reliability',
        ],
        remediation: 'Investigate root cause of failures; improve input validation or add pre-checks',
        occurrences: stats.total,
        severity: avgRetries > 2.5 ? 'high' : 'medium',
      });
    }
  }

  return antiPatterns;
}

function detectResourceWaste(dataset: ExecutionDataset): AntiPattern[] {
  const antiPatterns: AntiPattern[] = [];

  for (const execution of dataset.executions) {
    const waste = calculateResourceWaste(execution);

    if (waste.tokenWaste > 0.3) {  // 30%+ tokens wasted
      antiPatterns.push({
        antiPatternId: generateAntiPatternId(),
        name: 'Token Waste',
        description: `${(waste.tokenWaste * 100).toFixed(0)}% of tokens used in failed nodes`,
        type: 'resource_waste',
        indicators: [{
          metric: 'token_waste_ratio',
          threshold: 0.2,
          observed: waste.tokenWaste,
          comparison: 'above',
        }],
        consequences: [
          'Increased costs',
          'Wasted compute resources',
        ],
        remediation: 'Add early validation, implement circuit breakers, or reorder to fail fast',
        occurrences: 1,
        severity: waste.tokenWaste > 0.5 ? 'high' : 'medium',
      });
    }
  }

  return antiPatterns;
}
```

## Recommendation Generation

```typescript
interface LearnedRecommendation {
  recommendationId: string;
  type: RecommendationType;
  title: string;
  description: string;
  applicability: ApplicabilityCondition[];
  expectedBenefit: ExpectedBenefit;
  confidence: number;
  basedOn: {
    patterns: string[];
    antiPatterns: string[];
    sampleSize: number;
  };
}

type RecommendationType =
  | 'skill_selection'
  | 'graph_structure'
  | 'parallelization'
  | 'retry_configuration'
  | 'resource_allocation'
  | 'ordering_optimization';

interface ExpectedBenefit {
  metric: 'duration' | 'cost' | 'quality' | 'reliability';
  improvement: number;  // Percentage improvement
  confidence: number;
}

function generateRecommendations(
  patterns: Pattern[],
  antiPatterns: AntiPattern[]
): LearnedRecommendation[] {
  const recommendations: LearnedRecommendation[] = [];

  // Recommendations from successful patterns
  for (const pattern of patterns) {
    if (pattern.confidence >= 0.7 && pattern.occurrences >= 5) {
      recommendations.push(patternToRecommendation(pattern));
    }
  }

  // Recommendations from anti-patterns (avoid these)
  for (const antiPattern of antiPatterns) {
    if (antiPattern.occurrences >= 3) {
      recommendations.push(antiPatternToRecommendation(antiPattern));
    }
  }

  // Cross-pattern analysis
  recommendations.push(...crossPatternRecommendations(patterns, antiPatterns));

  // Sort by expected impact
  return recommendations.sort((a, b) =>
    b.expectedBenefit.improvement - a.expectedBenefit.improvement
  );
}

function patternToRecommendation(pattern: Pattern): LearnedRecommendation {
  const typeMapping: Record<PatternType, RecommendationType> = {
    'graph_structure': 'graph_structure',
    'skill_combination': 'skill_selection',
    'execution_order': 'ordering_optimization',
    'parallelization': 'parallelization',
    'retry_strategy': 'retry_configuration',
    'resource_allocation': 'resource_allocation',
    'failure_recovery': 'retry_configuration',
  };

  return {
    recommendationId: generateRecommendationId(),
    type: typeMapping[pattern.type],
    title: `Use: ${pattern.name}`,
    description: pattern.description,
    applicability: pattern.conditions.map(c => ({
      condition: c.condition ?? c.toString(),
      required: true,
    })),
    expectedBenefit: {
      metric: 'reliability',
      improvement: pattern.outcomes.successRate * 100 - 50,  // Above 50% baseline
      confidence: pattern.confidence,
    },
    confidence: pattern.confidence,
    basedOn: {
      patterns: [pattern.patternId],
      antiPatterns: [],
      sampleSize: pattern.occurrences,
    },
  };
}

function antiPatternToRecommendation(antiPattern: AntiPattern): LearnedRecommendation {
  return {
    recommendationId: generateRecommendationId(),
    type: inferRecommendationType(antiPattern),
    title: `Avoid: ${antiPattern.name}`,
    description: `${antiPattern.description}. ${antiPattern.remediation}`,
    applicability: antiPattern.indicators.map(i => ({
      condition: `${i.metric} is ${i.comparison} ${i.threshold}`,
      required: true,
    })),
    expectedBenefit: {
      metric: antiPattern.type === 'resource_waste' ? 'cost' : 'reliability',
      improvement: antiPattern.severity === 'critical' ? 40 :
                   antiPattern.severity === 'high' ? 25 :
                   antiPattern.severity === 'medium' ? 15 : 5,
      confidence: Math.min(0.9, 0.5 + antiPattern.occurrences * 0.05),
    },
    confidence: Math.min(0.9, 0.5 + antiPattern.occurrences * 0.05),
    basedOn: {
      patterns: [],
      antiPatterns: [antiPattern.antiPatternId],
      sampleSize: antiPattern.occurrences,
    },
  };
}

function crossPatternRecommendations(
  patterns: Pattern[],
  antiPatterns: AntiPattern[]
): LearnedRecommendation[] {
  const recommendations: LearnedRecommendation[] = [];

  // Find complementary skill patterns
  const skillPatterns = patterns.filter(p => p.type === 'skill_combination');
  for (let i = 0; i < skillPatterns.length; i++) {
    for (let j = i + 1; j < skillPatterns.length; j++) {
      const overlap = findSkillOverlap(skillPatterns[i], skillPatterns[j]);
      if (overlap.length > 0) {
        recommendations.push({
          recommendationId: generateRecommendationId(),
          type: 'skill_selection',
          title: `Synergy: ${overlap.join(' + ')}`,
          description: `Skills ${overlap.join(', ')} appear in multiple successful patterns`,
          applicability: [{ condition: 'Task requires multiple capabilities', required: true }],
          expectedBenefit: {
            metric: 'quality',
            improvement: 20,
            confidence: 0.7,
          },
          confidence: 0.7,
          basedOn: {
            patterns: [skillPatterns[i].patternId, skillPatterns[j].patternId],
            antiPatterns: [],
            sampleSize: skillPatterns[i].occurrences + skillPatterns[j].occurrences,
          },
        });
      }
    }
  }

  return recommendations;
}
```

## Pattern Library Report

```yaml
patternLibrary:
  libraryId: "pl-9d8c7b6a-5e4f-3a2b-1c0d"
  lastUpdated: "2024-01-15T12:00:00Z"

  statistics:
    totalPatterns: 15
    totalAntiPatterns: 6
    totalRecommendations: 21
    executionsAnalyzed: 234
    timeSpan: "30 days"

  topPatterns:
    - patternId: "pat-001"
      name: "Fan-out-Fan-in"
      type: graph_structure
      description: "Distribute work to parallel nodes, then aggregate results"
      confidence: 0.92
      occurrences: 45
      outcomes:
        successRate: 0.89
        avgDuration: 12500
        avgCost: 0.045

    - patternId: "pat-002"
      name: "Validation First"
      type: execution_order
      description: "Run validation before expensive operations"
      confidence: 0.88
      occurrences: 67
      outcomes:
        successRate: 0.94
        avgDuration: 8200
        avgCost: 0.028

    - patternId: "pat-003"
      name: "Code Analysis Triple"
      type: skill_combination
      description: "code-complexity-analyzer + code-security-scanner + code-performance-analyzer"
      confidence: 0.85
      occurrences: 23
      outcomes:
        successRate: 0.91
        avgDuration: 15000
        avgCost: 0.062

  topAntiPatterns:
    - antiPatternId: "anti-001"
      name: "Sequential Bottleneck"
      type: bottleneck_structure
      severity: high
      occurrences: 12
      remediation: "Split large sequential node into parallelizable subtasks"

    - antiPatternId: "anti-002"
      name: "Retry Storm"
      type: excessive_retries
      severity: medium
      occurrences: 8
      remediation: "Add pre-validation to catch issues before execution"

  recommendations:
    - recommendationId: "rec-001"
      type: parallelization
      title: "Parallelize Independent Analysis"
      description: "When running multiple analysis skills, execute them in parallel"
      expectedBenefit:
        metric: duration
        improvement: 45
        confidence: 0.85
      basedOn:
        patterns: ["pat-001", "pat-003"]
        sampleSize: 68

    - recommendationId: "rec-002"
      type: ordering_optimization
      title: "Validate Early"
      description: "Move validation nodes to earliest possible position"
      expectedBenefit:
        metric: cost
        improvement: 30
        confidence: 0.88
      basedOn:
        patterns: ["pat-002"]
        antiPatterns: ["anti-001"]
        sampleSize: 67

  trends:
    - observation: "Success rate improving over time"
      metric: successRate
      change: +0.08
      period: "last 30 days"

    - observation: "Average cost decreasing"
      metric: avgCost
      change: -0.015
      period: "last 30 days"
```

## Integration Points

- **Input**: Execution traces from `dag-execution-tracer`
- **Input**: Performance data from `dag-performance-profiler`
- **Input**: Failure data from `dag-failure-analyzer`
- **Output**: Patterns and recommendations to `dag-graph-builder`
- **Output**: Optimization hints to `dag-task-scheduler`

## Best Practices

1. **Minimum Sample Size**: Require 3+ observations before extracting patterns
2. **Confidence Decay**: Reduce confidence for patterns not seen recently
3. **Context Matters**: Patterns should include applicable conditions
4. **Actionable Output**: Recommendations must be implementable
5. **Continuous Learning**: Update library with each new execution

---

Learn from history. Find what works. Continuously improve.