# Testing Guide

This document describes all testing in the mdmdview repository: what is tested, how tests are organized, how to run them, and the infrastructure that supports them.

## Quick Reference

```bash
# Run the full test suite (~1,100+ tests, ~8 minutes)
nice -n 10 mdtimeout 600 cargo test

# Run a single module (seconds)
cargo test markdown_renderer
cargo test app
cargo test mermaid_renderer
cargo test pikchr_renderer
cargo test window_state

# Run with output
cargo test -- --nocapture

# Pre-release validation (format + lint + build + test)
.\scripts\prerelease.ps1

# Full test suite with optional coverage
.\scripts\full_test.ps1
.\scripts\full_test.ps1 -SkipCoverage

# Full test runner (Rust binary, generates HTML report)
cargo run --release --bin full_test
cargo run --release --bin full_test -- --skip-coverage
cargo run --release --bin full_test -- --quick --skip-mermaid

# Code coverage
cargo llvm-cov --workspace --summary-only
```

---

## Test Categories

The codebase uses four categories of testing:

| Category | Count | Runtime | Description |
|----------|-------|---------|-------------|
| [Unit tests](#unit-tests) | ~1,100+ | ~8 min (full), seconds (per module) | Rust `#[test]` functions across 29 source files |
| [D2 conformance tests](#d2-conformance-tests) | 12 | ~30-60 sec | Geometric invariant checks on D2 diagram output |
| [Visual regression tests](#visual-regression-tests) | 26 cases | ~5-10 min | Pixel-diff screenshot comparison for markdown rendering |
| [Mermaid visual tests](#mermaid-visual-tests) | 15 cases | ~2-5 min | Screenshot comparison against Mermaid CLI reference |

---

## Unit Tests

Unit tests are co-located with the source code using Rust's `#[cfg(test)]` convention. Every major module has its own test suite.

### Test Distribution

#### Core Application (`src/`)

| File | Tests | What It Validates |
|------|------:|-------------------|
| `src/markdown_renderer.rs` | ~403 | Markdown parsing, element conversion, image caching, table layout, link handling, emoji decoding, search highlighting, grapheme-aware text matching, element rect tracking |
| `src/app.rs` | ~224 | App state management, keyboard shortcuts, file loading/saving, navigation history (back/forward), view mode switching, search state, async file loading, error handling, drag-and-drop queue |
| `src/mermaid_renderer.rs` | ~202 | LRU cache management, SVG rasterization, texture caching, Mermaid configuration (theme, security, font, background), diagram type detection, width bucketing, worker pool |
| `src/window_state.rs` | ~45 | Window position/size persistence, file I/O, Windows registry operations, environment variable handling, config directory detection, geometry sanitization |
| `src/pikchr_renderer.rs` | ~43 | Pikchr rendering pipeline, SVG rasterization via usvg/resvg, texture caching, dark mode flag |
| `src/table_support/column_spec.rs` | ~41 | Table column width calculation, layout policy application, responsive column sizing |
| `src/main.rs` | ~19 | Icon generation, build metadata extraction, window initialization |
| `src/image_decode.rs` | ~18 | Image format parsing (PNG, JPEG, GIF, BMP, WebP, SVG), error handling, embedded asset loading |
| `src/d2_renderer.rs` | ~11 | D2 diagram rendering, SVG generation, texture caching |
| `src/sample_files.rs` | ~11 | Sample file content presence and integrity |
| `src/table_support/metrics.rs` | ~9 | Table metrics computation |
| `src/lru_cache.rs` | ~7 | Generic LRU cache insertion, eviction, capacity limits |
| `src/emoji_catalog.rs` | ~6 | Emoji shortcode mapping, asset verification |
| `src/theme.rs` | ~6 | Theme color configuration |
| `src/emoji_assets.rs` | ~3 | Emoji asset integrity |
| `src/lib.rs` | 1 | Process priority setting (Windows `BELOW_NORMAL_PRIORITY_CLASS`) |

#### D2 Diagram Library (`crates/d2/src/`)

| File | Tests | What It Validates |
|------|------:|-------------------|
| `crates/d2/src/edge_routing.rs` | ~39 | Edge routing algorithms, orthogonal path generation, label placement, nudge vectors |
| `crates/d2/src/parser.rs` | ~31 | D2 syntax parsing: statements, containers, properties, escape sequences, edge declarations |
| `crates/d2/src/compiler.rs` | ~22 | D2 graph compilation, style attributes, arrow types, containers, directionality |
| `crates/d2/src/svg_render.rs` | ~13 | SVG generation from compiled graph structure, label dimensions |
| `crates/d2/src/layout.rs` | ~12 | Graph layout algorithms, node positioning |
| `crates/d2/src/lib.rs` | ~12 | End-to-end D2 rendering pipeline |
| `crates/d2/src/graph.rs` | ~8 | Graph data structure operations |
| `crates/d2/src/text.rs` | ~7 | Text measurement and layout |
| `crates/d2/src/shapes.rs` | ~6 | Shape rendering and bounding box calculation |
| `crates/d2/src/layout_sugiyama.rs` | ~5 | Sugiyama hierarchical layout algorithm |
| `crates/d2/src/geo.rs` | ~5 | Geometry utilities: distance, line intersection |

#### Build Script

| File | Tests | What It Validates |
|------|------:|-------------------|
| `build.rs` | 5 | Semantic version parsing, Windows version encoding (4 u16 components), author extraction, ISO 8601 timestamp formatting |

### Test Infrastructure

The unit tests use several patterns to support thorough testing without mocks:

**Thread-local test fixtures** — Instead of mocking, the codebase uses thread-local globals that tests can set to inject specific behaviors. These are automatically cleaned up by RAII guard types.

App test fixtures (`src/app.rs`):

- `FORCED_APP_ACTIONS` — Force specific UI actions to trigger
- `FORCED_OPEN_PATH` / `FORCED_SAVE_PATH` — Simulate file dialog results
- `FORCED_LOAD_ERROR` — Simulate file load failures
- `FORCED_SCAN_ERROR` / `FORCED_SCAN_ENTRY_ERROR` — Simulate directory scan errors
- `FORCED_READ_LOSSY_ERROR` — Simulate lossy read failures
- `FORCE_THREAD_SPAWN_ERROR` — Simulate async thread spawn failures

Renderer test fixtures (`src/markdown_renderer.rs`):

- `FORCED_RENDER_ACTIONS` — Inject render actions
- `FORCED_TABLE_POLICIES` — Override table layout policies
- `FORCED_PARSE_ERROR` — Inject markdown parse errors
- `FORCE_EMOJI_DECODE_ERROR` — Simulate emoji decoding errors

Environment variable guards (`src/window_state.rs`, `src/markdown_renderer.rs`):

- `EnvGuard` / `EnvVarGuard` — RAII types that save and restore env vars
- `env_lock()` — Mutex that serializes env var access across test threads

I/O error simulation:

- `FailingWriter` — Mock `Write` implementation that returns errors on demand
- `force_parse_error_once()` — Thread-local one-shot error injection
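The shape of the fixture pattern is roughly the following. This is a minimal sketch, not the repository's actual fixture code: `FORCED_LOAD_ERROR` here is a simplified stand-in for the real fixture of the same name, and `load_file` is a hypothetical consumer.

```rust
use std::cell::RefCell;

thread_local! {
    // One-shot injected error, consumed by the code under test.
    static FORCED_LOAD_ERROR: RefCell<Option<String>> = RefCell::new(None);
}

/// RAII guard: installs a forced error and clears it on drop, so a
/// panicking test cannot leak injected state into the next test.
struct ForcedLoadError;

impl ForcedLoadError {
    fn set(msg: &str) -> Self {
        FORCED_LOAD_ERROR.with(|e| *e.borrow_mut() = Some(msg.to_string()));
        ForcedLoadError
    }
}

impl Drop for ForcedLoadError {
    fn drop(&mut self) {
        FORCED_LOAD_ERROR.with(|e| *e.borrow_mut() = None);
    }
}

// Production code checks the injection point before doing real I/O;
// every other code path stays real.
fn load_file(path: &str) -> Result<String, String> {
    if let Some(err) = FORCED_LOAD_ERROR.with(|e| e.borrow_mut().take()) {
        return Err(err);
    }
    std::fs::read_to_string(path).map_err(|e| e.to_string())
}

#[test]
fn load_failure_is_surfaced() {
    let _guard = ForcedLoadError::set("simulated I/O failure");
    assert_eq!(load_file("README.md"), Err("simulated I/O failure".into()));
}
```

`FailingWriter` works at the `Write` trait boundary in the same spirit: the error path is switched on explicitly rather than stubbed behind a mock framework.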
### Windows Test Priority

The test binary automatically lowers its CPU priority to `BELOW_NORMAL_PRIORITY_CLASS` on Windows via a `.CRT$XCU` initializer in `src/lib.rs`. This prevents the ~8-minute test suite from starving interactive applications, particularly over remote desktop sessions. The `test_process_priority_is_below_normal` test verifies this behavior.

Use `nice -n 10` when running `cargo test` to also lower the compilation priority.
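For reference, the mechanism looks roughly like this. This is a sketch only, assuming the `windows-sys` crate; the actual initializer in `src/lib.rs` may differ in detail.

```rust
// Windows-only: run code before main() via the CRT's C-initializer table.
#[cfg(windows)]
mod test_priority {
    use windows_sys::Win32::System::Threading::{
        GetCurrentProcess, SetPriorityClass, BELOW_NORMAL_PRIORITY_CLASS,
    };

    unsafe extern "C" fn lower_priority() {
        // GetCurrentProcess returns a pseudo-handle; no cleanup needed.
        SetPriorityClass(GetCurrentProcess(), BELOW_NORMAL_PRIORITY_CLASS);
    }

    // A function pointer placed in the .CRT$XCU section is invoked by the
    // MSVC CRT during startup, before main() and before any test runs.
    #[used]
    #[link_section = ".CRT$XCU"]
    static INIT: unsafe extern "C" fn() = lower_priority;
}
```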
---

## D2 Conformance Tests

**Location:** `crates/d2/tests/d2_conformance.rs`
**Fixtures:** `tests/d2_conformance/fixtures/` (24 `.d2` files)
**Output:** `tests/d2_conformance/actual/` (SVG artifacts for debugging)

These integration tests validate four geometric invariants that every D2 diagram must satisfy:

| Invariant | Description |
|-----------|-------------|
| `label_not_overlapping_node` | Edge labels must not overlap node rectangles |
| `edge_not_through_node` | Edge paths must not pass through non-endpoint nodes |
| `nodes_not_overlapping` | Node rectangles must not overlap (except container nesting) |
| `label_near_edge` | Edge labels must be within 50px of their edge path |

The conformance suite renders every `.d2` fixture file, parses the resulting SVG, extracts node rects and edge paths, then checks all four invariants. Failures dump the offending SVG for visual inspection.

**Supporting tests** in the conformance file verify the geometry utilities themselves: AABB overlap/containment, Liang-Barsky line-AABB intersection, point-to-segment distance, SVG path tokenization, and attribute extraction.
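The distance check behind `label_near_edge` reduces to a point-to-segment distance: project the label anchor onto each edge segment, clamp, and measure. A minimal sketch of that utility (illustrative names; the real implementation lives with the geometry tests):

```rust
/// Shortest distance from point `p` to the segment `(a, b)`.
/// `label_near_edge` asserts this stays <= 50.0 between the label
/// anchor and the nearest segment of the edge path.
fn point_segment_distance(p: (f64, f64), a: (f64, f64), b: (f64, f64)) -> f64 {
    let (dx, dy) = (b.0 - a.0, b.1 - a.1);
    let len2 = dx * dx + dy * dy;
    let t = if len2 == 0.0 {
        0.0 // Degenerate segment: fall back to distance from `a`.
    } else {
        // Project p onto the infinite line, then clamp into the segment.
        (((p.0 - a.0) * dx + (p.1 - a.1) * dy) / len2).clamp(0.0, 1.0)
    };
    let (cx, cy) = (a.0 + t * dx, a.1 + t * dy);
    ((p.0 - cx).powi(2) + (p.1 - cy).powi(2)).sqrt()
}

fn main() {
    // A label 10px above the midpoint of a horizontal edge passes easily.
    let d = point_segment_distance((50.0, -10.0), (0.0, 0.0), (100.0, 0.0));
    assert!(d <= 50.0, "label {d}px from edge, limit is 50px");
}
```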
**Python tooling** in `tools/d2_conformance/` provides additional analysis:

- `d2_svg_parse.py` — Parse SVG structure
- `generate_reference.py` / `generate_actual.py` — Render fixtures
- `compare.py` — Diff reference vs. actual
- `report.py` — Generate summary report

### Running

```bash
# Run D2 conformance tests
cargo test d2_conformance

# Dump all fixture SVGs for inspection
cargo test dump_all_fixture_svgs -- --ignored
```

---

## Visual Regression Tests

**Manifest:** `tests/regression/manifest.toml`
**Cases:** `tests/regression/cases/` (26 markdown files)
**Runner:** `tools/regression/runner.py`
**CI:** `.github/workflows/visual-regression.yml`

The visual regression suite captures screenshots of mdmdview rendering specific markdown documents and compares them pixel-by-pixel against a baseline. This catches rendering regressions that unit tests cannot.

### How It Works

1. **Reference generation** — The runner renders each test case markdown through markdown-it (or another reference renderer) into HTML, then captures a Playwright/Chromium screenshot at a fixed viewport.
2. **Baseline capture** — mdmdview renders the same markdown using its `--screenshot` mode at the same viewport dimensions.
3. **Pixel diff** — The runner compares actual vs. baseline images, computing pixel differences and percentage deviation.
4. **Pass/fail** — A case passes if differences are below the configured thresholds (`max_pixels` and `max_percent`); see the sketch after the configuration defaults below.

### Compare Modes

- **`mdmdview`** (default) — Compares mdmdview's current output against a known-good mdmdview baseline. Used for most cases.
- **`reference`** — Compares mdmdview output against the markdown-it HTML reference. Used when you want to check against a canonical renderer.

### Test Cases (26)

| ID | Title | Notes |
|----|-------|-------|
| 001 | Headings and paragraphs | |
| 002 | Lists and nesting | |
| 003 | Inline formatting | |
| 004 | Code blocks | |
| 005 | Tables and wrapping | |
| 006 | Images and missing assets | |
| 007 | Emoji coverage | |
| 008 | Mermaid diagrams | `compare_mode = "mdmdview"` (no reference renderer equivalent) |
| 009 | Unicode and RTL | |
| 010 | Large document | Two snapshots: `top` (scroll 0.0) and `mid` (scroll 0.5) |
| 011-026 | Table variants | Alignments, escaping, missing cells, wide tables, line breaks, unicode, dark mode, inline elements, lists, blockquotes |

### Configuration

Defaults in `manifest.toml`:

- Viewport: 1280x720, DPR 1.0
- Theme: light
- Zoom: 1.0
- Diff thresholds: max 500 pixels or 10% different
- Wait: 2000ms, settle 3 frames

Per-case overrides are supported for theme, viewport, compare mode, scroll position, and thresholds.
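Concretely, the pass/fail rule from step 4 can be read as requiring the diff to stay under both bounds. A small Rust restatement of that logic, under that assumption (the actual check lives in `tools/regression/runner.py`; `Thresholds` and `case_passes` are illustrative names):

```rust
/// Pass/fail rule as described above (sketch, not the runner's code).
struct Thresholds {
    max_pixels: u64,  // default 500
    max_percent: f64, // default 10.0
}

fn case_passes(diff_pixels: u64, total_pixels: u64, t: &Thresholds) -> bool {
    let percent = 100.0 * diff_pixels as f64 / total_pixels as f64;
    diff_pixels <= t.max_pixels && percent <= t.max_percent
}

fn main() {
    let t = Thresholds { max_pixels: 500, max_percent: 10.0 };
    // 1280x720 viewport: 300 changed pixels is ~0.03% — a pass.
    assert!(case_passes(300, 1280 * 720, &t));
    // 600 changed pixels trips max_pixels even though the percentage is tiny.
    assert!(!case_passes(600, 1280 * 720, &t));
}
```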
### Running

```bash
# Prerequisites
python -m pip install -r tools/regression/requirements.txt
python -m playwright install chromium
cargo build --release

# Full workflow
python tools/regression/runner.py update-reference   # Generate reference screenshots
python tools/regression/runner.py update-baseline    # Capture mdmdview baselines
python tools/regression/runner.py run                # Compare actual vs. baseline

# Single case
python tools/regression/runner.py run --case 001-headings

# With deterministic fonts
python tools/regression/runner.py run --test-fonts tools/regression/fonts
```

### CI Integration

The `visual-regression.yml` workflow runs on PRs that touch `src/`, `tests/regression/`, or `tools/regression/`. It builds a release binary and runs a smoke test against case `001-headings`, uploading diff artifacts on failure.

### Adding a New Case

1. Create a markdown file in `tests/regression/cases/`
2. Add an entry to `manifest.toml` with an ID, path, and title
3. Keep content deterministic — no network references or timestamps
4. Run `update-baseline` to capture the initial baseline

---

## Mermaid Visual Tests

**Cases:** `tests/mermaid_visual/cases/` (15 markdown files)
**Runner:** `tools/mermaid_visual_check.py`
**Output:** `tests/mermaid_visual/{actual,reference,diff,report}/`

These tests compare mdmdview's embedded Mermaid rendering against the official Mermaid CLI (`mmdc`) output.

### Test Cases (15)

Covers the major Mermaid diagram types: flowchart, flowchart with subgraphs, sequence, sequence with alt/loop, class, complex class, ER, gantt, state, nested state, gitgraph, journey, mindmap, pie, timeline.

### Prerequisites

- Built mdmdview release binary
- Python 3.11+ with Pillow
- Node.js + `npx` or `mmdc` on PATH
- `mdscreensnap` on PATH

### Running

```bash
# All cases
python tools/mermaid_visual_check.py

# Single case
python tools/mermaid_visual_check.py --case flowchart

# Adjust thresholds
python tools/mermaid_visual_check.py --threshold-percent 5.0 --threshold-pixels 1000
```

---

## CI Pipelines

### Release Workflow (`.github/workflows/release.yml`)

Triggered by version tags (`v*`) or manual dispatch. Runs the full quality gate:

1. `cargo fmt --all -- --check` — Formatting
2. `cargo clippy --all-targets -- -D warnings` — Linting (zero warnings)
3. `cargo test --all-targets` — Full test suite
4. `cargo build --release` — Release binary
5. `cargo wix --no-build` — MSI installer

All steps must pass before artifacts are published.

### Visual Regression Workflow (`.github/workflows/visual-regression.yml`)

Triggered on PRs touching rendering code. Runs a smoke test:

1. Set up Python 3.11 + Playwright Chromium
2. Build release binary
3. Run regression on `001-headings`
4. Upload diff/report artifacts

### Pre-Release Script (`scripts/prerelease.ps1`)

Local equivalent of the release CI pipeline — run before tagging:

```powershell
.\scripts\prerelease.ps1   # Runs: fmt check → clippy → release build → test
```

### Full Test Script (`scripts/full_test.ps1`)

Runs the complete test suite with optional coverage:

```powershell
.\scripts\full_test.ps1                 # Tests + coverage
.\scripts\full_test.ps1 -SkipCoverage   # Tests only
```

Uses `cargo-llvm-cov` for coverage (install with `cargo install cargo-llvm-cov`).

### Full Test Runner (`full_test.exe`)

Single Rust binary that runs all quality checks, tests, Mermaid visual comparisons, and coverage — then generates a self-contained HTML report.

```bash
# Full run (all phases)
cargo run --release --bin full_test

# Skip coverage (much faster)
cargo run --release --bin full_test -- --skip-coverage

# Quick run, no Mermaid visual tests
cargo run --release --bin full_test -- --quick --skip-mermaid

# All options
cargo run --release --bin full_test -- --help
```

**Phases** (run sequentially, fast-to-slow):

1. **Quality checks** — `cargo fmt --check`, `cargo clippy --workspace --all-targets -- -D warnings`
2. **Unit + D2 tests** — `cargo test --release --lib --tests --workspace --no-fail-fast -- --test-threads=1`
   - Uses `--lib --tests` (not `--all-targets`) to avoid relinking the running binary on Windows
3. **Mermaid visual tests** — builds mdmdview, screenshots each of the 15 Mermaid test cases, pixel-diffs against reference images (see the sketch after this list)
   - Thresholds: <12% diff and <45,000 pixels (per-channel tolerance: 60)
   - Reference images in `tests/mermaid_visual/reference/`
   - Actual output in `tests/mermaid_visual/actual_full_test/`
4. **Coverage** — `cargo llvm-cov --workspace --branch --json --no-cfg-coverage` (gracefully skipped if `cargo-llvm-cov` not installed)
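Per-channel tolerance here means a pixel counts as different only when at least one RGB channel deviates by more than 60, which keeps anti-aliasing noise out of the count. A sketch of that comparison under those assumptions (illustrative code, not the binary's actual implementation):

```rust
/// Count pixels where any RGB channel differs by more than `tol`.
/// With tol = 60, small anti-aliasing differences are ignored.
fn diff_pixels(a: &[[u8; 3]], b: &[[u8; 3]], tol: u8) -> usize {
    a.iter()
        .zip(b)
        .filter(|(pa, pb)| {
            pa.iter()
                .zip(pb.iter())
                .any(|(&ca, &cb)| ca.abs_diff(cb) > tol)
        })
        .count()
}

fn main() {
    let reference = [[200, 200, 200], [10, 10, 10]];
    let actual = [[230, 200, 200], [10, 10, 120]]; // +30 (within), +110 (beyond)
    let n = diff_pixels(&reference, &actual, 60);
    assert_eq!(n, 1);
    // full_test then applies both gates: n < 45,000 and n / total < 12%.
}
```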
**CLI flags:**

| Flag | Description |
|------|-------------|
| `--skip-quality` | Skip fmt/clippy checks |
| `--skip-coverage` | Skip coverage analysis |
| `--skip-mermaid` | Skip Mermaid visual tests |
| `--quick` | Exclude ignored tests |
| `--help` | Show usage |

**Output:** Self-contained HTML report saved to `tests/results/{YYYY.MM.DD} - full test report.html` with auto-increment on date collision. Features dark mode support, dashboard cards, a collapsible test breakdown, and a Mermaid visual results table.

**Exit code:** 1 if any phase has hard failures, 0 otherwise.

---

## Testing Philosophy

### No Mocks

Tests validate real logic. Instead of mock objects, the codebase uses thread-local injection points to control specific behaviors (file dialog results, I/O errors, thread spawn failures) while keeping all other code paths real.

### Test Oracles

Each test category uses a different oracle — a definition of "correct" that can be checked mechanically:

| Category | Oracle |
|----------|--------|
| Unit tests | Explicit assertions on state, output, and invariants |
| D2 conformance | Four geometric invariants (no overlaps, labels near edges, paths don't cross nodes) |
| Visual regression | Pixel-diff against known-good baselines within tolerance thresholds |
| Mermaid visual | Pixel-diff against official Mermaid CLI output |

### CPU Priority

The full test suite takes ~8 minutes and is CPU-intensive. Two mechanisms prevent it from starving the system:

1. **Automatic priority lowering** — The test binary sets `BELOW_NORMAL_PRIORITY_CLASS` on Windows via a CRT initializer
2. **Manual nice** — Use `nice -n 10 mdtimeout 600 cargo test` to also lower the compilation priority

### What Is Not Tested

- **No property-based testing** — All tests are example-based (no quickcheck/proptest)
- **No fuzz testing** — No fuzzing harnesses for the markdown parser
- **No benchmarks** — Performance validation is manual
- **No end-to-end UI automation** — The egui app loop is not driven by automated UI tests; visual regression covers rendering accuracy instead