# Benchmarks This document explains how to run the Criterion benchmarks, how datasets are chosen/created, and how to generate persistent sample datasets for reproducible measurements. The benchmark suite measures: - Sequential vs parallel processing - With and without line-numbered code blocks - Multiple dataset sizes (tiny, small, optionally medium) By default, runs are silent to avoid skewing timings with console I/O. --- ## Quick start - Run (parallel by default): - Linux/macOS: - `cargo bench --bench context_bench` - Windows PowerShell: - `cargo bench --bench context_bench` - Include the medium dataset (heavier, disabled by default): - Linux/macOS: - `CB_BENCH_MEDIUM=1 cargo bench --bench context_bench` - Windows PowerShell: - `$env:CB_BENCH_MEDIUM=1; cargo bench --bench context_bench` - HTML reports: - Open: `target/criterion/report/index.html` - Or per-benchmark: `target/criterion/context_builder/*/report/index.html` --- ## Parallel vs sequential Parallel processing is enabled by default via the `parallel` feature (rayon). - Force sequential: - `cargo bench --no-default-features --bench context_bench` - Force parallel (even if defaults change): - `cargo bench --features parallel --bench context_bench` Note: Benchmarks compare both “line_numbers” and “no_line_numbers” modes. Line numbering does additional formatting work and is expected to be slower. --- ## Silence during benchmarks Benchmarks set `CB_SILENT=1` once at startup so logs and prompts don’t impact timings. - To see output during benchmarks: - Linux/macOS: - `CB_SILENT=0 cargo bench --bench context_bench` - Windows PowerShell: - `$env:CB_SILENT=0; cargo bench --bench context_bench` Prompts are auto-confirmed inside benches, so runs are fully non-interactive. --- ## Dataset selection Each scenario picks an input dataset with the following precedence: 1) If `./samples//project` exists, it is used. 2) Else, if `CB_BENCH_DATASET_DIR` is set, `//project` is used. 3) Else, a synthetic dataset is generated in a temporary directory for the run. Datasets used: - tiny: ~100 text files (fast sanity checks) - small: ~1,000 text files (default performance checks) - medium: ~5,000 text files (only when `CB_BENCH_MEDIUM=1` is set) Default filters in the benches focus on text/code: `rs`, `md`, `txt`, `toml`. Common ignored directories: `target`, `node_modules`. Binary files are generated but skipped by filters. --- ## Reproducing results For more stable and reproducible measurements: - Generate persistent datasets into `./samples/` (see below). - Keep your machine’s background activity low during runs. - Run each scenario multiple times and compare Criterion reports. --- ## Generating persistent sample datasets You have two options to generate datasets into `./samples`: ### Option A: Cargo bin (feature-gated) The repository provides a generator binary gated behind the `samples-bin` feature. - Linux/macOS: - `cargo run --no-default-features --features samples-bin --bin generate_samples -- --help` - Windows PowerShell: - `cargo run --no-default-features --features samples-bin --bin generate_samples -- --help` Examples: - Generate default presets (tiny, small) into `./samples`: - `cargo run --no-default-features --features samples-bin --bin generate_samples` - Include medium and large: - `cargo run --no-default-features --features samples-bin --bin generate_samples -- --presets tiny,small,medium --include-large` - Only one preset with custom parameters: - `cargo run --no-default-features --features samples-bin --bin generate_samples -- --only small --files 5000 --depth 4 --width 4 --size 1024` - Clean output before generating: - `cargo run --no-default-features --features samples-bin --bin generate_samples -- --clean` - Dry run (print plan only): - `cargo run --no-default-features --features samples-bin --bin generate_samples -- --dry-run` ### Option B: Standalone compile with rustc If you prefer not to use the Cargo feature gating, compile the script directly: - Linux/macOS: - `rustc scripts/generate_samples.rs -O -o generate_samples && ./generate_samples --help` - Windows PowerShell: - `rustc scripts/generate_samples.rs -O -o generate_samples.exe; .\generate_samples.exe --help` Examples mirror Option A; just replace the leading command with `./generate_samples` (or `.\generate_samples.exe` on Windows). --- ## Directory layout of generated samples The generator produces datasets under `./samples//project`, which benches discover automatically. Each `project` tree contains: - `src/`, `docs/`, `assets/` with nested subdirectories and text files - `target/`, `node_modules/` populated with noise (ignored by default) - Top-level `README.md`, `Cargo.toml` - Binary `.bin` files sprinkled to validate binary handling It’s recommended to add `/samples` to `.gitignore` if not already present. --- ## Comparing modes - Sequential vs Parallel: - Sequential (no rayon): `cargo bench --no-default-features --bench context_bench` - Parallel (rayon): `cargo bench --features parallel --bench context_bench` - With vs Without line numbers: - Both modes are exercised in each run; consult the per-benchmark report pages for timings. --- ## Troubleshooting - Benchmarks produce no output: - Expected. They run with `CB_SILENT=1`. Set `CB_SILENT=0` to see logs. - Medium dataset missing: - Set the flag explicitly: `CB_BENCH_MEDIUM=1`. - Or pre-generate samples so the benches find `./samples/medium/project`. - Reports are empty or unchanged: - Remove previous results and re-run: - `rm -rf target/criterion` (Linux/macOS) - `Remove-Item -Recurse -Force target\criterion` (Windows PowerShell) - Sequential vs parallel deltas are small: - On tiny datasets, overheads dominate. Use small or medium for more signal. - Try enabling/disabling line numbers to observe formatting costs. --- Happy benchmarking!