---
name: watch
description: Start EvalView watch mode to automatically re-run regression checks whenever project files change.
---

# Watch Mode

Use this skill when the user wants continuous regression monitoring during development. Watch mode observes file changes and automatically re-runs `evalview check`, debouncing bursts of changes so that a flurry of saves triggers one run rather than many.

## What this does

EvalView's watch mode uses `watchdog` to monitor directories for file changes (`.py`, `.yaml`, `.yml`, `.json`, `.md`, `.txt`, `.toml`, `.cfg`, `.ini`). When a change is detected, it runs a regression check via the `gate()` API and displays a live scorecard with pass/fail status, score deltas, tool changes, and streak tracking.
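
The filter-then-debounce behavior described above can be sketched in plain Python. This is a minimal illustration and not EvalView's actual implementation: `is_watched`, `Debouncer`, and their methods are hypothetical names, and the "wait until no event has arrived for the full interval" strategy is an assumption about how the debounce works. Only the extension list and the 2.0-second default come from this document.

```python
from pathlib import Path
from typing import Optional

# Extensions listed in this document; everything else is ignored.
WATCHED_EXTENSIONS = {
    ".py", ".yaml", ".yml", ".json", ".md", ".txt", ".toml", ".cfg", ".ini",
}


def is_watched(path: str) -> bool:
    """Return True if the file extension is one that should trigger a check."""
    return Path(path).suffix in WATCHED_EXTENSIONS


class Debouncer:
    """Collapse a burst of change events into a single trigger (hypothetical helper)."""

    def __init__(self, interval: float = 2.0) -> None:
        self.interval = interval
        self._last_event: Optional[float] = None

    def record(self, now: float) -> None:
        """Note that a relevant change happened at time `now` (seconds)."""
        self._last_event = now

    def ready(self, now: float) -> bool:
        """True once `interval` seconds have passed since the last recorded change."""
        if self._last_event is None:
            return False
        if now - self._last_event >= self.interval:
            self._last_event = None  # reset so one burst fires only once
            return True
        return False
```

Under these assumptions, two saves at t=0 and t=1 with a 2-second interval produce a single trigger at t=3 rather than two separate runs.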

## How to start watch mode

Watch mode is a CLI command (not an MCP tool). Help the user run it:

```
evalview watch
```

### Common options

- `--quick` — Skip LLM judge, deterministic checks only ($0 cost, sub-second)
- `--path src/ --path tests/` — Watch specific directories (default: current directory)
- `--test "my-test"` — Only check a specific test by name
- `--test-dir tests/evalview` — Path to test cases directory (default: `tests`)
- `--interval 1` — Debounce interval in seconds (default: 2.0)
- `--fail-on REGRESSION,TOOLS_CHANGED` — Comma-separated statuses that count as failure (default: REGRESSION)
- `--sound` — Terminal bell on regression
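
To make the `--fail-on` semantics concrete, here is a small sketch of how a comma-separated status list could be interpreted. The function names are hypothetical and this is not EvalView's code; `PASSED` is an assumed name for a non-failing status, while `REGRESSION`, `TOOLS_CHANGED`, and `OUTPUT_CHANGED` are the statuses that appear in this document.

```python
from typing import Iterable, Set


def parse_fail_on(value: str = "REGRESSION") -> Set[str]:
    """Split a comma-separated --fail-on value into a set of status names."""
    return {part.strip().upper() for part in value.split(",") if part.strip()}


def should_fail(statuses: Iterable[str], fail_on: Set[str]) -> bool:
    """True if any test status is in the set that counts as failure."""
    return any(status in fail_on for status in statuses)


# With the default, a tool change alone does not count as a failure.
results = ["PASSED", "TOOLS_CHANGED"]  # "PASSED" is an assumed status name
default_outcome = should_fail(results, parse_fail_on())
strict_outcome = should_fail(results, parse_fail_on("REGRESSION,TOOLS_CHANGED"))
```

Here `default_outcome` is `False` and `strict_outcome` is `True`, which is the difference between the default and a stricter `--fail-on` list.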

### Examples

```
# Basic: watch everything, full checks
evalview watch

# Fast development loop: no LLM judge, 1-second debounce
evalview watch --quick --interval 1

# Watch specific directories and one test
evalview watch --path src/ --path tests/ --test "calculator-division"

# Strict mode: fail on any behavioral change
evalview watch --fail-on REGRESSION,TOOLS_CHANGED,OUTPUT_CHANGED --sound
```
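
When assembling the command on the user's behalf, a helper along these lines can keep the flags consistent. `build_watch_command` is a hypothetical function written for illustration. It uses only the flags documented above and does not run anything itself.

```python
import shlex
from typing import List, Optional, Sequence


def build_watch_command(
    quick: bool = False,
    paths: Sequence[str] = (),
    test: Optional[str] = None,
    test_dir: Optional[str] = None,
    interval: Optional[float] = None,
    fail_on: Sequence[str] = (),
    sound: bool = False,
) -> List[str]:
    """Build an argv list for `evalview watch` from the documented options."""
    cmd = ["evalview", "watch"]
    if quick:
        cmd.append("--quick")
    for path in paths:  # --path may be repeated
        cmd += ["--path", path]
    if test:
        cmd += ["--test", test]
    if test_dir:
        cmd += ["--test-dir", test_dir]
    if interval is not None:
        cmd += ["--interval", str(interval)]
    if fail_on:
        cmd += ["--fail-on", ",".join(fail_on)]
    if sound:
        cmd.append("--sound")
    return cmd


# Render a copy-pasteable shell command for the user.
command_line = shlex.join(build_watch_command(quick=True, interval=1))
```

Returning a list rather than a string means the result can be passed directly to `subprocess.run` without shell quoting concerns, while `shlex.join` produces a safely quoted string for display.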

## Prerequisites

Watch mode requires the `watchdog` package. If it is not installed, install EvalView with the `watch` extra:

```
pip install "evalview[watch]"
```
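
Before suggesting the install command, it can help to check whether `watchdog` is already importable in the user's environment. The snippet below is a generic standard-library check, not part of EvalView, and `module_available` is a hypothetical helper name.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


if not module_available("watchdog"):
    print('watchdog is missing; run: pip install "evalview[watch]"')
```

Note that this checks the interpreter running the snippet, so it should be run with the same Python environment that `evalview` is installed in.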

## Notes

- Watch mode excludes `.evalview/`, `.git/`, `venv/`, `node_modules/`, `__pycache__/`, and other common non-source directories automatically.
- The initial check runs immediately on startup before watching begins.
- Results include a live scorecard with pass counts, regression counts, health percentage, and streak info.
- `--quick` mode suits tight development loops: it skips the LLM judge, so it costs nothing and finishes in under a second.
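
The directory exclusion noted above can be pictured as a check on each path component. This is a sketch under assumptions, not EvalView's implementation: `is_excluded` is a hypothetical name, and the set below contains only the directories named in this document (the real list includes other common non-source directories).

```python
from pathlib import PurePosixPath

# Only the directories named in this document; the actual list is longer.
EXCLUDED_DIRS = {".evalview", ".git", "venv", "node_modules", "__pycache__"}


def is_excluded(path: str) -> bool:
    """Return True if any component of a POSIX-style path is an excluded directory."""
    return any(part in EXCLUDED_DIRS for part in PurePosixPath(path).parts)
```

For example, `is_excluded("node_modules/pkg/package.json")` is true even though `.json` is a watched extension, so changes inside dependency folders would not trigger a check.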