---
name: directed-test-input-generator
description: "Generate targeted test inputs to reach specific code paths and hard-to-reach behaviors in Python code. Use when: (1) Targeting uncovered branches or specific execution paths, (2) Need coverage-guided test generation, (3) Want to leverage LLM understanding of code semantics for meaningful test inputs, (4) Testing boundary conditions and edge cases systematically, (5) Combining symbolic reasoning with fuzzing. Provides path analysis, constraint solving, coverage-guided strategies, and LLM-driven semantic generation for comprehensive test input creation."
---

# Directed Test Input Generator

Generate test inputs that target specific code paths and hard-to-reach behaviors using program analysis, coverage feedback, and LLM-driven semantic understanding.

## Overview

Directed test input generation combines multiple techniques to create test inputs that explore specific execution paths:

1. **Path Analysis**: Extract control flow paths and their constraints
2. **Constraint Solving**: Generate inputs satisfying path conditions
3. **Coverage Guidance**: Use coverage feedback to iteratively reach new paths
4. **LLM Semantic Understanding**: Leverage code understanding for meaningful inputs

## Quick Start

### Basic Workflow

```python
# 1. Analyze code to extract paths
from scripts.path_analyzer import analyze_code_paths

paths = analyze_code_paths(source_code)

# 2. Generate inputs for each path
from scripts.input_generator import generate_test_suite

test_suite = generate_test_suite(paths)

# 3. Execute tests and measure coverage
for path_id, test_data in test_suite.items():
    result = execute_test(function, test_data['inputs'])
    verify_coverage(result, test_data['target_line'])
```

## Core Techniques

### 1. Path Analysis and Extraction

Extract execution paths and their constraints from code:

```python
from scripts.path_analyzer import analyze_code_paths, print_paths

source = """
def validate_age(age, country):
    if age < 0:
        raise ValueError("Invalid age")
    if age < 18:
        return "minor"
    if age >= 65 and country == "US":
        return "senior_us"
    return "adult"
"""

paths = analyze_code_paths(source)
print_paths(paths)

# Output:
# Path #0: exception handler (ValueError) (line 3)
#   Conditions:
#     - age < 0
#
# Path #1: if branch (line 5)
#   Conditions:
#     - age >= 0
#     - age < 18
#
# Path #2: if branch (line 7)
#   Conditions:
#     - age >= 0
#     - age >= 18
#     - age >= 65
#     - country == US
```

### 2. Constraint-Based Input Generation

Generate inputs that satisfy specific path constraints:

```python
from scripts.input_generator import TestInputGenerator

generator = TestInputGenerator()

# Generate input for path: age >= 65 AND country == "US"
constraints = {
    "age": ["age >= 65"],
    "country": ["country == US"]
}

inputs = generator.generate_for_path(constraints)
# Result: {"age": 65, "country": "US"}
```

### 3. Edge Case Generation

Generate boundary values systematically:

```python
from scripts.input_generator import EdgeCaseGenerator

# Generate edge cases for integer type
edge_cases = EdgeCaseGenerator.generate_edge_cases(int)
# [0, -1, 1, -2147483648, 2147483647, ...]

# Generate boundary values around a threshold
boundaries = EdgeCaseGenerator.generate_boundary_values(">", 65)
# [64, 65, 66, 67] - values around the boundary
```

### 4. Coverage-Guided Generation

Iteratively generate inputs to maximize coverage:

```python
def coverage_guided_testing(function, max_iterations=100):
    """
    Generate test inputs guided by coverage feedback.
    """
    covered_lines = set()
    test_corpus = []

    # Start with initial inputs
    current_inputs = generate_seed_inputs(function)

    for i in range(max_iterations):
        # Execute and measure coverage
        new_coverage = execute_with_coverage(function, current_inputs)

        if new_coverage - covered_lines:
            # New coverage reached - save inputs
            test_corpus.append(current_inputs)
            covered_lines.update(new_coverage)

        # Mutate inputs to explore new paths
        current_inputs = mutate_toward_uncovered(current_inputs, covered_lines)

    return test_corpus
```

See **[coverage_strategies.md](references/coverage_strategies.md)** for detailed coverage-guided strategies.

### 5. LLM-Driven Semantic Generation

Use LLM understanding of code semantics to generate meaningful inputs:

```python
# Example: LLM generates semantically appropriate inputs
function_code = """
def book_flight(passenger_age, departure_date, destination_country):
    # ... booking logic
"""

# LLM understands parameter semantics and generates realistic inputs:
llm_generated = [
    {
        "passenger_age": 35,           # Adult
        "departure_date": "2026-06-15",  # Future date
        "destination_country": "US"      # Valid country code
    },
    {
        "passenger_age": 8,            # Child passenger
        "departure_date": "2026-07-20",
        "destination_country": "UK"
    },
    {
        "passenger_age": 72,           # Senior citizen
        "departure_date": "2026-08-10",
        "destination_country": "CA"
    }
]
```

See **[llm_patterns.md](references/llm_patterns.md)** for comprehensive LLM-driven generation patterns.

## Common Use Cases

### Use Case 1: Target Specific Branch

Generate input to reach an uncovered branch:

```python
# Target: age > 100 branch
def check_age(age):
    if age > 100:
        return "exceptionally_old"  # Want to test this
    return "normal"

# Analyze paths
paths = analyze_code_paths(check_age_source)
target_path = paths[0]  # age > 100 path

# Generate input
generator = TestInputGenerator()
constraints = target_path.get_constraints()
test_input = generator.generate_for_path(constraints)
# Result: {"age": 101}

# Test
assert check_age(**test_input) == "exceptionally_old"
```

### Use Case 2: Systematic Boundary Testing

Test all boundaries in a function:

```python
def calculate_discount(age, is_premium):
    if age < 18:
        return 0.0
    elif age < 65:
        return 0.1 if is_premium else 0.05
    else:
        return 0.2 if is_premium else 0.15

# Generate boundary test suite
boundaries = [
    {"age": 17, "is_premium": False},  # Just below 18
    {"age": 18, "is_premium": False},  # Exactly 18
    {"age": 19, "is_premium": True},   # Just above 18
    {"age": 64, "is_premium": False},  # Just below 65
    {"age": 65, "is_premium": True},   # Exactly 65
    {"age": 66, "is_premium": False},  # Just above 65
]

for inputs in boundaries:
    result = calculate_discount(**inputs)
    print(f"{inputs} -> {result}")
```

### Use Case 3: Coverage-Driven Exploration

Maximize code coverage through iterative generation:

```python
def complex_function(x, y, z):
    if x > 10:
        if y < 5:
            if z == 0:
                return "path_A"  # Hard to reach
    elif x < 0:
        if y > 10:
            return "path_B"
    return "default"

# Coverage-guided approach
covered = set()
test_inputs = []

# Iteration 1: Try random input
input_1 = {"x": 5, "y": 3, "z": 1}
coverage_1 = execute_with_coverage(complex_function, input_1)
# Covers: default path

# Iteration 2: Mutate toward uncovered (x > 10)
input_2 = {"x": 11, "y": 7, "z": 1}
coverage_2 = execute_with_coverage(complex_function, input_2)
# Covers: x > 10 but not y < 5

# Iteration 3: Refine toward (y < 5)
input_3 = {"x": 11, "y": 4, "z": 1}
coverage_3 = execute_with_coverage(complex_function, input_3)
# Covers: x > 10 and y < 5 but not z == 0

# Iteration 4: Target (z == 0)
input_4 = {"x": 11, "y": 4, "z": 0}
result = complex_function(**input_4)
# SUCCESS: Reached "path_A"
```

## Advanced Strategies

### Hybrid Approach: Combining Techniques

```python
def hybrid_test_generation(function_source):
    """
    Combine multiple techniques for comprehensive coverage.
    """
    # 1. Symbolic analysis - extract paths
    paths = analyze_code_paths(function_source)

    # 2. Constraint solving - generate initial inputs
    symbolic_inputs = [generate_for_path(p.get_constraints()) for p in paths]

    # 3. Edge case generation - add boundary values
    edge_inputs = generate_edge_cases_for_function(function_source)

    # 4. LLM semantic generation - add realistic scenarios
    llm_inputs = query_llm_for_realistic_inputs(function_source)

    # 5. Coverage-guided refinement - fill gaps
    all_inputs = symbolic_inputs + edge_inputs + llm_inputs
    coverage = measure_coverage(all_inputs)

    # 6. Mutate to reach remaining uncovered paths
    refined_inputs = coverage_guided_mutation(all_inputs, coverage)

    return refined_inputs
```

### Branch Distance Minimization

Guide input generation toward uncovered branches:

```python
def minimize_branch_distance(target_condition, current_input):
    """
    Adjust input to get closer to satisfying target condition.

    Example:
        Target: x > 100
        Current: x = 50
        Distance: 100 - 50 + 1 = 51

        New input: x = 101 (distance = 0)
    """
    variable = target_condition.variable
    operator = target_condition.operator
    threshold = target_condition.value

    if operator == ">":
        return {**current_input, variable: threshold + 1}
    elif operator == "<":
        return {**current_input, variable: threshold - 1}
    elif operator == ">=":
        return {**current_input, variable: threshold}
    elif operator == "<=":
        return {**current_input, variable: threshold}
    elif operator == "==":
        return {**current_input, variable: threshold}

    return current_input
```

## Reference Documentation

### Detailed Coverage Strategies
See **[coverage_strategies.md](references/coverage_strategies.md)** for:
- Branch distance minimization
- Gradient-based input adjustment
- Symbolic constraint solving
- Mutation-based fuzzing
- Hybrid coverage-guided + LLM approaches

### LLM-Driven Patterns
See **[llm_patterns.md](references/llm_patterns.md)** for:
- Semantic input generation
- Path-directed generation with LLM
- Domain knowledge integration
- Adversarial input generation
- Property-based test generation
- Multi-step scenario generation

## Best Practices

### 1. Start with Symbolic Analysis
Extract paths first to understand what needs to be tested:
```python
paths = analyze_code_paths(source)
print(f"Found {len(paths)} paths to cover")
```

### 2. Generate Diverse Inputs
Combine multiple generation strategies:
- Constraint solving for known paths
- Edge cases for boundaries
- LLM for semantic realism
- Fuzzing for unexpected cases

### 3. Use Coverage Feedback
Monitor coverage and adjust strategy:
```python
if coverage_improvement < 0.01:
    # Switch from symbolic to fuzzing
    switch_to_mutation_based()
```

### 4. Validate Generated Inputs
Always check that generated inputs are valid:
```python
def validate_input(inputs, function_signature):
    required_params = get_parameters(function_signature)
    assert all(p in inputs for p in required_params)
```

### 5. Prioritize Hard-to-Reach Paths
Focus on paths requiring specific conditions:
```python
# Prioritize deeply nested conditions
priority_paths = [p for p in paths if len(p.conditions) > 3]
```

## Tools Reference

### Scripts

**path_analyzer.py** - Extract control flow paths and constraints from Python code
```bash
python scripts/path_analyzer.py < your_code.py
```

**input_generator.py** - Generate test inputs satisfying path constraints
```bash
python scripts/input_generator.py --paths paths.json
```

### Example Workflow

```python
from scripts.path_analyzer import analyze_code_paths
from scripts.input_generator import generate_test_suite

# 1. Analyze code
source = open("my_module.py").read()
paths = analyze_code_paths(source)

# 2. Generate inputs
test_suite = generate_test_suite(paths)

# 3. Run tests
for path_id, test_data in test_suite.items():
    print(f"Testing path {path_id}: {test_data['description']}")
    result = my_function(**test_data['inputs'])
    print(f"  Result: {result}")
```