---
name: generate-tests
description: Generate EvalView test cases — either from a SKILL.md file using LLM-powered generation, or by capturing real agent interactions through a proxy.
---

# Generate Tests

Use this skill when the user wants to create test cases for their AI agent or skill without writing YAML by hand.

## Four approaches

### 1. Generate tests from a SKILL.md file

Use the `generate_skill_tests` MCP tool to auto-generate a test suite from a skill definition. This reads the SKILL.md and produces YAML test cases covering explicit triggers, implicit triggers, contextual triggers, and negative cases.

**Steps:**
1. Ask the user which SKILL.md to generate tests for (or detect it from context).
2. Call `generate_skill_tests` with:
   - `skill_path`: path to the SKILL.md file
   - `output_path` (optional): where to save the generated YAML
   - `count` (optional): number of test cases (default: 10)
3. After generation, offer to run the tests with `run_skill_test`.

**CLI equivalent:**
```
evalview skill generate-tests .claude/skills/my-skill/SKILL.md --auto
evalview skill generate-tests .claude/skills/my-skill/SKILL.md -c 20 -o tests/my-skill-tests.yaml
```

### 2. Create individual test cases manually

Use the `create_test` MCP tool to create a single test YAML file from a description.

**Steps:**
1. Gather from the user: test name, query, expected tools, forbidden tools, expected output keywords, and minimum score.
2. Call `create_test` with the parameters.
3. After creating the test, call `run_snapshot` to establish the golden baseline.

### 3. Capture real interactions

Use the CLI `evalview capture` command to proxy real agent traffic and save interactions as test YAMLs automatically. This records the query, output, and tool calls from live usage.

**CLI equivalent:**
```
evalview capture --agent http://localhost:8080/execute --output-dir tests/test-cases
evalview capture --multi-turn  # saves all turns as one multi-turn conversation test
```

### 4. Validate a skill before testing

Use `validate_skill` to check a SKILL.md for correct structure and completeness before generating tests from it.

## Running generated tests

After generating tests, execute them with `run_skill_test`:
- `test_file`: path to the generated YAML
- `no_rubric: true` for fast deterministic-only checks (no LLM cost)
- `verbose: true` for detailed output on all tests

**CLI equivalent:**
```
evalview skill test tests/my-skill-tests.yaml
evalview skill test tests/my-skill-tests.yaml --no-rubric  # fast, $0
evalview skill test tests/my-skill-tests.yaml --verbose --model claude-sonnet-4-20250514
```