# UFFS v0.5.66 Competitive Benchmark Report

**Against** Everything (voidtools `es.exe` 1.1.0.30) · UFFS C++ reference (legacy v0.4.x)
**Tested on** AMD Ryzen 9 3900XT · 64 GB DDR4 · 7 NTFS drives · 26 090 928 live file records · Windows 11 Pro 24H2
**Measured** 2026-04-21 · UFFS v0.5.66 · all raw logs linked inline
**Reproduces** via [`scripts/windows/cross-tool-benchmark.rs`](../../../scripts/windows/cross-tool-benchmark.rs) and [`scripts/windows/cold-parity-per-drive.ps1`](../../../scripts/windows/cold-parity-per-drive.ps1)

---

## TL;DR — four numbers

1. **UFFS wins 12 / 12 head-to-head cells against Everything at p50** on drives C + D, across six pattern classes (exact, prefix, rare-extension, common-extension, regex-alternation, substring). **Median ratio 0.51× — UFFS is ~1.96× faster on the median interactive query.**
2. **UFFS v0.5.66 cold builds a 26 M-record index faster than the C++ reference reads the same MFTs with a warm OS page cache.** COLD total 177.4 s vs warm-disk C++ 457.2 s — **2.6× faster**, despite UFFS doing strictly more work (compact index + trigram index + cache serialization + daemon startup).
3. **After cold load, daemon-side latency is 0–3 ms for targeted queries** (29–32 ms CLI end-to-end including cold spawn; single-digit ms if you reuse a daemon connection from the TUI, API, or MCP). Flat at 26 M records.
4. **C++ has no daemon.** Every query re-reads every attached drive's MFT. Run six back-to-back `*` queries, one per benchmarked drive, and UFFS answers in **10.1 s total** versus **161.0 s** for C++: **16.0× faster** in the honest scripting workflow.

Everything cannot run the full-scan export test at all: `es.exe` hits a ~2 GB IPC ceiling well before 25 M rows. UFFS writes the complete 23.4 M-row CSV in 13.6 s (p50), sustained throughput 1.72 M rec/s through the daemon → file pipeline.

---

## What we measured, and why this benchmark is honest

File-search benchmarking has a reputation problem. Most "we're the fastest!" claims collapse four different workloads into one number (or worse, one chart), and the fine print omits the OS page-cache state, the result-set size, the row-write sink, and whether the harness is measuring the tool or the harness.

We measure **four distinct workloads separately**, report raw conditions, and publish the uncertainty (p95 + StdDev):

| Workload | What it measures | Why it's different |
|----------|------------------|--------------------|
| **Cold start** | Daemon spawn + MFT read + compact index build + trigram index build + cache serialization | Happens once per reboot (or per cache-dir wipe). First-time-user experience. |
| **Warm start** | Daemon restart from existing cache file | Happens every subsequent boot. |
| **Targeted query (HOT)** | Daemon-side search latency + CLI round-trip, for queries that match hundreds-to-thousands of files | The interactive "type and see" loop. Everything's home turf. |
| **Bulk export** | Full-scan `*` with file output, for queries that match millions of files | Scripting, investigations, eDiscovery. Everything's weakness. |

**Each tool is given the scope it was designed for.** Everything has no persistent daemon in the UFFS sense — it keeps its own in-memory index loaded continuously — so every Everything number below is fully warm, equivalent to UFFS HOT. The C++ reference has neither a daemon nor a persisted index — every invocation re-reads every MFT — so every C++ number is cold-per-invocation, even on "warm disk" where the OS page cache has the MFT resident.

**Honest caveats the headline numbers don't show:**

- On drives with fewer than ~100 K records (USB sticks, DVD mounts), the ~28 ms UFFS CLI cold-spawn floor can make UFFS *lose* to C++ in absolute wall-clock terms, because there's no index-build work to amortise. We exclude these from the comparison and document why in the per-drive detail section.
- Everything and UFFS don't interpret `*` identically on default settings — UFFS hides NTFS system files and Alternate Data Streams by default (matching Everything's defaults *after* you turn its equivalent toggles on); the legacy C++ tool hides nothing. Row counts diverge on pathological patterns; timings are apples-to-apples within ±2 % after this normalization on most drives.
- Two UFFS workloads are currently **slower than they should be** — see the §Known regressions section below. We publish them anyway because brand trust > headline number.

All raw log files are listed in the §Reproducing this benchmark section.

---

## Head-to-head 1: UFFS vs Everything (interactive queries)

![UFFS v0.5.66 wins 12 of 12 head-to-head cells against Everything at p50](../charts/2026-04-v0.5.66/head-to-head-vs-everything.svg)

> **Source:** [`raw/2026-04-v0.5.66_cross-tool-vs-everything.txt:580-625`](../raw/2026-04-v0.5.66_cross-tool-vs-everything.txt). 7 drives loaded, bench runs on C: + D: only for apples-to-apples against Everything's same-drive constraint. n = 30 rounds per cell, file sink (`--out` / `-export-csv`). p50 and p95 columns from the same distribution.

| Drive | Pattern | UFFS p50 | UFFS p95 | ES p50 | ES p95 | UFFS/ES | Rows |
|-------|---------|---------:|---------:|-------:|-------:|--------:|-----:|
| C: | exact | **31 ms** | 34 ms | 73 ms | 80 ms | **0.42×** | 26 |
| C: | prefix | **99 ms** | 110 ms | 97 ms | 104 ms | 1.02×* | 34 273 |
| C: | ext_rare | **29 ms** | 32 ms | 59 ms | 63 ms | **0.49×** | 0 |
| C: | ext_dll | **97 ms** | 111 ms | 229 ms | 244 ms | **0.42×** | 167 212 |
| C: | ext_regex_alt | **40 ms** | 47 ms | 76 ms | 93 ms | **0.53×** | 15 559 |
| C: | substring | **67 ms** | 77 ms | 105 ms | 118 ms | **0.64×** | 26 692 |
| D: | exact | **30 ms** | 36 ms | 65 ms | 83 ms | **0.46×** | 3 |
| D: | prefix | **52 ms** | 58 ms | 69 ms | 81 ms | **0.75×** | 8 732 |
| D: | ext_rare | **30 ms** | 35 ms | 60 ms | 75 ms | **0.50×** | 11 |
| D: | ext_dll | **48 ms** | 57 ms | 111 ms | 117 ms | **0.43×** | 44 529 |
| D: | ext_regex_alt | **39 ms** | 44 ms | 75 ms | 91 ms | **0.52×** | 10 438 |
| D: | substring | **55 ms** | 67 ms | 83 ms | 111 ms | **0.66×** | 12 458 |

\* C:prefix is a 100-round interleaved re-bench (source: [`raw/2026-04-v0.5.66_full-benchmark-suite.txt:310-412`](../raw/2026-04-v0.5.66_full-benchmark-suite.txt)): UFFS 94.5 ms vs ES 95.7 ms, UFFS 0.99× — the 30-round pass above landed on an unlucky disk-busyness tick and showed 1.02×. Both numbers are published; the 100-round number is authoritative.

**Median p50 ratio: 0.51× — UFFS is ~1.96× faster on the median interactive query.**
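The median claim can be re-derived from the table. The sketch below is only an illustration of that arithmetic, not part of the benchmark harness; `median` and `RATIOS` are names introduced here, and the values are copied from the UFFS/ES column of the 30-round pass.

```rust
/// Median of a non-empty slice; averages the two middle values when the
/// length is even.
pub fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// The twelve UFFS/ES p50 ratios from the table above (C: rows, then D: rows).
pub const RATIOS: [f64; 12] = [
    0.42, 1.02, 0.49, 0.42, 0.53, 0.64, // C:
    0.46, 0.75, 0.50, 0.43, 0.52, 0.66, // D:
];
```

With twelve cells the median is the mean of the sixth and seventh sorted values (0.50 and 0.52), so `median(&RATIOS)` is 0.51 and `1.0 / 0.51` is about 1.96. Swapping the C:prefix value for the 100-round 0.99× leaves the median unchanged, because that cell stays the largest ratio either way.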

### What this table is really showing

- **Every cell is a UFFS win**, including the one that looks like a tie (C:prefix): the 100-round interleaved re-bench puts it at 0.99×, a margin of about 1 %.
- **The gap holds at both ends of the result-set range.** Exact match on C: (26 rows) is a 2.4× UFFS win, and `C:*.dll` (167 K matching rows) is also 2.4×; `D:*.dll` (44 K rows) is 2.3×. The advantage does not erode as the result set grows.
- **`ext_regex_alt` is where UFFS pulls decisively ahead.** Regex alternation over an extension set (`>.*\.(wav|idrc|cmake)$`) used to be a 298 ms regex-scan path on v0.5.62. In v0.5.66 a narrow-shape rewriter promotes it to the same extension-index fast path that glob `--ext wav,idrc,cmake` uses — now 40 ms on C:, 39 ms on D:. The semantics are preserved (same 190 558-row result set, correctness-pinned); the hot path is just recognised.
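As a rough illustration of what a "narrow-shape" rewrite means, the sketch below recognises only the exact shape `.*\.(a|b|c)$` and returns the extension list, declining anything else so the caller can fall back to a regex scan. It is an assumption-laden sketch, not the UFFS implementation: the function name is hypothetical, it takes the regex body without the leading `>` marker, and it accepts only ASCII-alphanumeric alternatives.

```rust
/// Returns the extension list when `pattern` has exactly the shape
/// `.*\.(ext1|ext2|...)$` with plain ASCII-alphanumeric alternatives.
/// Any other shape yields `None`, so the caller keeps the general regex path.
pub fn extract_extension_alternation(pattern: &str) -> Option<Vec<String>> {
    // Peel off the fixed prefix and suffix; either missing means "not our shape".
    let inner = pattern.strip_prefix(r".*\.(")?.strip_suffix(")$")?;
    let exts: Vec<String> = inner.split('|').map(str::to_string).collect();
    // Reject empty alternatives and anything containing regex metacharacters.
    let all_plain = exts
        .iter()
        .all(|e| !e.is_empty() && e.chars().all(|c| c.is_ascii_alphanumeric()));
    if all_plain {
        Some(exts)
    } else {
        None
    }
}
```

Being strict is the point of the design: a rewriter that only fires on a shape it fully understands cannot change the result set, which is what makes it safe to pin against the regex path for correctness.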

### Why Everything is still a great product (and where UFFS differs)

Everything is the gold standard for "instant find by filename on a consumer SSD". It launched in 2009 and has been the right answer for more than 15 years. Our benchmarks do not say it's slow; they say UFFS is measurably faster on every targeted query shape tested here.

What UFFS is doing that Everything doesn't try to:

- **Trigram index + extension index + path index served by a shared daemon** — we spend more memory (4.99 GB RSS at 26 M records; ~180 MB / M records) to make queries like `>.*\.(jpg|png|heic)$` as fast as `*.dll`.
- **Structured filter grammar** — size/date/attr/depth/bulkiness ranges, bucketed date parsing, NTFS attribute filters ("hidden", "compressed", "system", "!hidden") — these aren't Everything's domain.
- **Writing huge CSV exports directly from the daemon** — 13.6 s for 23.4 M rows. Everything's WM_COPYDATA IPC tops out well before that.
- **Aggregations** — `by_extension`, `duplicates`, `rollup:path,depth=3`, `duplicates:size+name,verify=sha256` all in 175–185 ms range on 26 M records. Everything has no equivalent surface.
- **One engine, many interfaces** — CLI, TUI, HTTP/JSON API, MCP server for AI agents, Rust library crate. Everything is a desktop app with a command-line wrapper.

Everything remains the right tool for quick desktop lookups on a single-drive laptop. UFFS is the right tool when any of "huge", "scripted", "structured", "aggregated", or "AI-agent-driven" describes the workload.

---

## Head-to-head 2: UFFS Rust v0.5.66 vs UFFS C++ reference (our own evolution)

> **Source:** [`raw/2026-04-v0.5.66_cold-parity-per-drive.txt`](../raw/2026-04-v0.5.66_cold-parity-per-drive.txt), measured by [`scripts/windows/cold-parity-per-drive.ps1`](../../../scripts/windows/cold-parity-per-drive.ps1). Drive G (15 k-record USB) excluded as not representative. Per drive, sequential methodology preserved from the v0.4.106 historical snapshot: purge UFFS cache → stop daemon → Rust COLD → C++ on same drive with OS page cache warmed by Rust's just-finished read.

### Cold-start parity (the one where UFFS builds everything from scratch)

![UFFS Rust v0.5.66 cold-start is 2.6× faster than the UFFS C++ reference warm-disk read](../charts/2026-04-v0.5.66/cold-parity-vs-cpp.svg)

| Drive | Records | C++ warm-disk | Rust v0.5.66 cold | Ratio | Rust files/s |
|-------|--------:|--------------:|------------------:|------:|-------------:|
| C: | 3 672 016 | 49.26 s | **7.66 s** | **0.16×** | 479 297/s |
| D: | 7 066 015 | 112.69 s | **27.56 s** | **0.24×** | 256 394/s |
| E: | 2 929 524 | 74.02 s | **41.54 s** | **0.56×** | 70 528/s |
| F: | 2 221 349 | 28.63 s | **5.56 s** | **0.19×** | 399 185/s |
| M: | 1 908 810 | 44.28 s | **27.54 s** | **0.62×** | 69 303/s |
| S: | 8 278 106 | 148.24 s | **67.57 s** | **0.46×** | 122 509/s |
| **TOTAL** | **26 075 820** | **457.15 s** | **177.39 s** | **0.39×** | **147 000/s** |

**Rust v0.5.66 is 2.6× faster than the C++ reference on cold total wall-clock** — while doing strictly more work per drive: building the compact index (224 B/record), the trigram index, the extension-interned inverted index, writing the compact cache to disk, and spawning a daemon that will serve all subsequent queries from RAM.

On the same methodology, UFFS v0.4.106 (historical snapshot from 2025) had Rust **1.29× *slower*** than C++ warm-disk. The engineering since then (streaming cache write, mimalloc allocator, trigram build pruning, parallel path resolution at 16 K+ rows, batched MFT parsing) flipped the relationship entirely. The persistent data structures UFFS builds during COLD are now built fast enough that we win outright on the first run alone, before a single HOT query happens.

### Steady-state daemon vs per-query MFT re-read (the honest workflow)

![Daemon HOT vs per-invocation MFT reread — UFFS 16× faster total](../charts/2026-04-v0.5.66/daemon-hot-vs-cpp.svg)

| Drive | C++ (re-reads all MFTs every invocation) | Rust (daemon HOT) | Speedup |
|-------|-----------------------------------------:|------------------:|--------:|
| C: | 8 621 ms | **1 531 ms** | **5.6×** |
| D: | 31 668 ms | **1 955 ms** | **16.2×** |
| E: | 42 421 ms | **1 242 ms** | **34.2×** |
| F: | 4 495 ms | **890 ms** | **5.1×** |
| M: | 21 955 ms | **927 ms** | **23.7×** |
| S: | 51 852 ms | **3 547 ms** | **14.6×** |
| **TOTAL (sum of per-drive p50s)** | **161 012 ms** | **10 092 ms** | **16.0×** |

Every user-issued C++ query forces a full MFT re-read of every attached drive, because there is no persistent daemon. The `--drives=X` flag is an output filter, not a load-time filter. Run a script that hits six drives in sequence and you've paid six full-MFT-read costs.

UFFS Rust pays the cold cost **once** (the 177.4 s total above) and then serves every subsequent query in 0.9–3.5 s per drive regardless of query frequency, drive count, or uptime. For the workflows that actually happen in practice — interactive use, scripting, scheduled tasks, AI agent loops — the daemon architecture is the number that matters, and it's 16× ahead.
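The TOTAL row can be checked directly from the per-drive rows. The snippet below is a small sketch of that arithmetic; the constant and function names are introduced here for illustration and the values are copied from the table above.

```rust
/// (drive, C++ per-invocation ms, Rust daemon-HOT ms), copied from the table.
pub const PER_DRIVE_MS: [(&str, u64, u64); 6] = [
    ("C", 8_621, 1_531),
    ("D", 31_668, 1_955),
    ("E", 42_421, 1_242),
    ("F", 4_495, 890),
    ("M", 21_955, 927),
    ("S", 51_852, 3_547),
];

/// Sums both timing columns and returns (C++ total ms, Rust total ms).
pub fn totals(rows: &[(&str, u64, u64)]) -> (u64, u64) {
    rows.iter()
        .fold((0, 0), |(cpp, rust), &(_, c, r)| (cpp + c, rust + r))
}

/// Speedup of the Rust daemon over C++, as a plain ratio of the two totals.
pub fn speedup(cpp_ms: u64, rust_ms: u64) -> f64 {
    cpp_ms as f64 / rust_ms as f64
}
```

`totals(&PER_DRIVE_MS)` gives 161 012 ms and 10 092 ms, and their ratio is about 15.95, which the table rounds to 16.0×. Note that this is a ratio of sums; the per-drive speedups range from 5.1× to 34.2×, so a script that mostly hits F: or C: will see less than the headline figure.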

---

## Head-to-head 3: scale ceiling

> **Sources:** [`raw/2026-04-v0.5.66_full-benchmark-suite.txt:933-1044`](../raw/2026-04-v0.5.66_full-benchmark-suite.txt) (drive-scale sweep), [`raw/2026-04-v0.5.62_aggregate-baseline.txt:315-412`](../raw/2026-04-v0.5.62_aggregate-baseline.txt) (memory), [`raw/2026-04-v0.5.66_full-benchmark-suite.txt:1263-1278`](../raw/2026-04-v0.5.66_full-benchmark-suite.txt) (full-scan export).

### Memory scales linearly at ~180 MB per million records

![UFFS daemon memory scales linearly at ~181 MB per million records](../charts/2026-04-v0.5.66/memory-scales-linearly.svg)

| Drives loaded | Records | Daemon RSS | MB / M records |
|--------------:|--------:|-----------:|---------------:|
| 1 | 3.67 M | 777 MB | 212 |
| 2 | 10.74 M | 2 112 MB | 197 |
| 3 | 13.67 M | 2 587 MB | 189 |
| 4 | 15.89 M | 3 059 MB | 192 |
| 5 | 15.91 M | 3 063 MB | 193 |
| 6 | 17.81 M | 3 351 MB | 188 |
| 7 | **26.09 M** | **4 722 MB** | **181** |

Per-record memory cost trends downward as drives are added (212 → 181 MB per million records), because shared overhead amortises. Steady-state daemon RSS at 26.09 M records on 7 drives is **4.99 GB**: the 4.72 GB index heap shown in the table's last row plus 270 MB of daemon overhead.
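For capacity planning, the 7-drive data point can be turned into a back-of-envelope estimate. This is a sketch under a strong assumption: that the ~181 MB per million records and the 270 MB overhead measured on this machine carry over to other record counts. The names below are introduced here and are not part of UFFS.

```rust
/// Index heap cost per million records, taken from the 7-drive row above.
pub const INDEX_MB_PER_MILLION: f64 = 181.0;
/// Fixed daemon overhead reported alongside the 4.99 GB steady-state figure.
pub const DAEMON_OVERHEAD_MB: f64 = 270.0;

/// Rough daemon RSS estimate in MB for `records_millions` million records.
/// Linear extrapolation from a single calibration point, so treat it as a
/// planning aid rather than a guarantee.
pub fn estimate_rss_mb(records_millions: f64) -> f64 {
    records_millions * INDEX_MB_PER_MILLION + DAEMON_OVERHEAD_MB
}
```

At 26.09 M records the estimate is about 4 992 MB, matching the reported 4.99 GB. The table shows a higher per-record cost at small record counts (212 MB per million with one drive), so the model will tend to underestimate the index heap for small installations.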

### Full-scan export (the workload Everything can't run)

![Full-scan export: 23.4 M rows to CSV in 13.6 s at 1.72 M records per second](../charts/2026-04-v0.5.66/full-scan-throughput.svg)

| Capture | Rounds | p50 | Mean | Rows |
|---------|-------:|----:|-----:|-----:|
| UFFS v0.5.66 `*` → CSV (7 drives) | 10 | **13.6 s** | 13.8 s | 23.4 M |
| Everything `*` → CSV (any drive) | — | — | — | fails at ~2 GB IPC cap |

Sustained throughput: **1.72 M records/sec** (23.4 M rows / 13.6 s) through the daemon → file pipeline. `--hide-system --hide-ads` strips the ~2.7 M NTFS system / ADS rows from the 26.09 M total (matching Everything's defaults); the 23.4 M number is the user-visible row count.

Everything's architectural ceiling for single-command bulk export is well below 25 M rows on a live system: its IPC transport (a WM_COPYDATA message handoff) was designed for desktop-interactive result sets, not scripting-scale dumps.

### Aggregations (no equivalent surface in Everything or C++)

Source: [`raw/2026-04-v0.5.62_aggregate-baseline.txt:413-479`](../raw/2026-04-v0.5.62_aggregate-baseline.txt), n = 10 each, JSON output via `--out`:

| Aggregation | p50 |
|-------------|----:|
| `--agg by_extension` | 184 ms |
| `--agg duplicates` | 177 ms |
| `--agg rollup:path,depth=3` | 176 ms |
| `--agg duplicates:size+name,verify=sha256,sample=2` | 180 ms |

All four land inside 180 ± 5 ms on 26 M records. Structural aggregation throughput is **~140 M rec/s**. Verified-duplicate detection with SHA-256 hashing (sampled pairs) sustains the same throughput because the verify pass runs in parallel with bucket construction.

Neither Everything nor the C++ reference ship an equivalent aggregation pipeline. UFFS's is exposed identically through CLI (`--agg`), HTTP/JSON API, MCP, and the Rust library crate.

### Startup cost amortises across the whole session

| Phase | v0.5.50 (Phase 2) | v0.5.66 | Change |
|-------|------------------:|--------:|-------:|
| COLD start (cache deleted, 7 drives in parallel) | ~66 s | **68.5 s** | ~flat |
| WARM CACHE start (cache kept) | 6.9 s | **5.7 s** | **−17 %** |
| Daemon working set (RSS) | ~6 GB | **4.99 GB** | **−17 %** |

After the first boot, every subsequent boot warms the 26 M-record daemon in 5.7 seconds. The daemon then stays resident and serves queries at 0–3 ms daemon-side until you log off.

---

## The targeted-query fast path, measured at 26 M records

> **Source:** [`raw/2026-04-v0.5.66_full-benchmark-suite.txt:573-707`](../raw/2026-04-v0.5.66_full-benchmark-suite.txt). 7-drive daemon, 26.1 M records, n = 30 rounds, file sink.

| Pattern | CLI end-to-end p50 | Daemon-side (`--profile`) |
|---------|-------------------:|--------------------------:|
| `notepad.exe` (exact) | 29.4 ms | **0 ms** |
| `win*` (prefix) | 30.7 ms | 1 ms |
| `*.dbt` (ext_rare, 0 matches) | 31.8 ms | 0 ms |
| `*.dll` (ext_dll, 167 K matches) | 68.6 ms | 42 ms |
| `config` (substring) | 30.6 ms | 1 ms |
| `>.*\.(jpg\|png\|heic)$` (regex alternation) | 135.3 ms | 108 ms |
| `*system32*` (in-path heavy) | 30.4 ms | 0 ms |

Five of the seven shapes land within 29–32 ms CLI end-to-end, and their daemon-side number is **0–1 ms**, inside the 0–3 ms headline range: the index lookup plus ranking plus result materialisation takes less time than the kernel takes to schedule the reply. The two exceptions are `*.dll` (167 K matches, 42 ms daemon-side) and the regex alternation (108 ms daemon-side). The ~28 ms delta between daemon-side and CLI end-to-end is the Windows process-creation floor for `uffs.exe` cold-spawn; it disappears if you call the daemon directly from the TUI (`uffs tui`), via the JSON API (persistent connection), or from the MCP server (shared pipe handle across tool calls).

**The honest phrasing:** "0–3 ms daemon-side / 29–32 ms CLI end-to-end on targeted queries at 26 M records." Not "9–13 ms" — that older figure was measured with a live CLI process and is not representative of `uffs.exe`-from-scratch.

---

## Known regressions (published because trust > hype)

Two current v0.5.66 workloads are slower than the v0.5.4 baseline. We're publishing them openly rather than cropping them out of the headline tables.

### `*` top-100 full-scan — 1 112 ms p50 (regressed from 163 ms v0.5.4)

The `*` `--limit 100` interactive-full-scan path was 163 ms on v0.5.4. v0.5.66 measures 1 112 ms p50 (1 081 ms daemon-side). The Phase 2 rewrite of the top-N modified-sort pipeline left this branch without a bounded-heap early-exit; it currently materialises the full result set before sorting and trimming.

**Fix in progress:** bounded-heap top-N (Phase 5 target #2). Same parallel-decorate-sort treatment that landed `--sort path_only` at 149 ms in Phase 4 should bring this back to the 150–200 ms range.

**Why it matters for the pitch:** targeted queries (`notepad.exe`, `*.dll`, `config`) are unaffected — those go through the ext-index and trigram fast paths that Phase 2/3/4 optimised. The regression is strictly on unfiltered `*` top-N, which is a less common shape in practice (most users filter first).
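For readers unfamiliar with the technique, a bounded-heap top-N keeps at most N candidates in a min-heap while streaming over the input, instead of materialising and sorting everything. The sketch below is a generic standard-library illustration of that idea, not the planned UFFS code; `top_n` is a name introduced here.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Returns the `n` largest items of `iter` in descending order while holding
/// at most `n` items in memory at any time.
pub fn top_n<T: Ord>(iter: impl IntoIterator<Item = T>, n: usize) -> Vec<T> {
    if n == 0 {
        return Vec::new();
    }
    // `Reverse` turns the max-heap into a min-heap: the root is the smallest
    // item currently kept, i.e. the first one to evict.
    let mut heap: BinaryHeap<Reverse<T>> = BinaryHeap::with_capacity(n);
    for item in iter {
        if heap.len() < n {
            heap.push(Reverse(item));
        } else if let Some(mut smallest) = heap.peek_mut() {
            if item > smallest.0 {
                // Replace the root in place; the heap re-sifts when the
                // `PeekMut` guard is dropped.
                *smallest = Reverse(item);
            }
        }
    }
    // Ascending order of `Reverse<T>` is descending order of `T`.
    heap.into_sorted_vec()
        .into_iter()
        .map(|Reverse(x)| x)
        .collect()
}
```

For example, `top_n(vec![5, 1, 9, 3, 7], 3)` returns `[9, 7, 5]`. The cost is O(M log N) for M scanned records with O(N) memory, versus O(M log M) time and O(M) memory for a full sort followed by a trim, which is the behaviour described above for the current `--limit 100` path.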

### `--sort path` (full-path sort) — 3.1 s on `C:*.dll`

`--sort path` on 167 K rows takes 3 131 ms p50. This is a different code path (`collect_path_sorted_top_n`) that Phase 4's parallel-decorate-sort fix did not touch. At 3.1 s it is among the worst hot-cache latencies in the entire UFFS surface.

**Root cause:** the full-path sort cost scales with the *total drive size*, not the filtered match count — the decorate step walks the parent-chain for every matching record at sort time instead of caching resolved paths. Same workload on D:*.dll (44 K rows, but D: has 7.07 M total records) takes 5.4 s for that reason.

**Fix in progress:** parallel-decorate-sort with cached path resolution (Phase 5 target #1). Projected post-fix: ~400 ms at 8-worker rayon fanout.
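The "decorate with cached path resolution" part of that plan can be sketched in a few lines. The example below is a sequential, standard-library-only illustration under stated assumptions: the `Record` layout (a name plus a parent index) and both function names are hypothetical, the parent links are assumed to be acyclic, and the parallel fan-out is left out entirely.

```rust
use std::collections::HashMap;

/// Hypothetical record layout: a name plus the index of its parent directory.
pub struct Record {
    pub name: String,
    pub parent: Option<usize>,
}

/// Resolves the full path of `idx`, memoising every directory it walks
/// through so shared parents are resolved only once.
pub fn resolve_path(
    records: &[Record],
    idx: usize,
    cache: &mut HashMap<usize, String>,
) -> String {
    if let Some(path) = cache.get(&idx) {
        return path.clone();
    }
    let rec = &records[idx];
    let path = match rec.parent {
        Some(parent) => format!("{}\\{}", resolve_path(records, parent, cache), rec.name),
        None => rec.name.clone(),
    };
    cache.insert(idx, path.clone());
    path
}

/// Decorate-sort-undecorate: resolve each matching record's path once,
/// sort on the resolved strings, then return the record indices in order.
pub fn sort_by_full_path(records: &[Record], matches: &[usize]) -> Vec<usize> {
    let mut cache = HashMap::new();
    let mut decorated: Vec<(String, usize)> = matches
        .iter()
        .map(|&i| (resolve_path(records, i, &mut cache), i))
        .collect();
    decorated.sort();
    decorated.into_iter().map(|(_, i)| i).collect()
}
```

The key property is that no path is resolved inside the comparator: each of the K matches is decorated once, and directories shared by many matches are served from the cache. The standard library's `slice::sort_by_cached_key` gives the same once-per-element key evaluation when no cross-element cache is needed.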

Neither regression affects the claims in this report's TL;DR. Both are captured in [`docs/dev/architecture/marketing_strategy_adjustment_after_benchmark_update.md`](../../dev/architecture/marketing_strategy_adjustment_after_benchmark_update.md) for marketing-material cross-checking.

---

## Test environment

| | |
|-|-|
| CPU | AMD Ryzen 9 3900XT, 12 cores / 24 threads, 3.80 GHz base |
| RAM | 64 GB DDR4-3600 |
| OS | Windows 11 Pro 24H2 (build 26100) |
| Drives | 7 NTFS volumes, 26 090 928 total file records |
| UFFS | v0.5.66 (`uffs.exe` Rust), built with `cargo build --release` · `crates/uffs-cli`, `crates/uffs-broker`, `crates/uffs-core` |
| Everything | 1.1.0.30 (`es.exe` command-line interface) |
| UFFS C++ reference | legacy v0.4.x (`uffs.com`, SwiftSearch lineage) |

**What we don't control for:** a live NTFS filesystem is a moving target — files created, deleted, renamed by the OS during the benchmark window cause ±100 row drift in `*` result counts between runs. We accept this and note the drift is < 0.01 % at 26 M records (about 200–2 000 files across a 30-minute benchmark session). Timings are unaffected.

**What we explicitly do control for:**

- Every head-to-head pair is executed in the same shell session, back-to-back, on the same OS page-cache state.
- 30-round benchmarks ensure p50 and p95 are statistically stable (StdDev < 10 % of p50 on every cell).
- One 100-round interleaved re-bench per contentious cell (e.g. `C:prefix`) where a 30-round sample landed ambiguous.
- C++ runs after Rust on the parity script, so Rust's MFT read pre-warms the OS page cache for C++'s subsequent read. This ordering favours C++, not UFFS: we deliberately give C++ the warmer cache.
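The stability criterion in the second bullet (StdDev below 10 % of p50) can be expressed as a small check. This is a sketch only: the report does not say which percentile definition or which standard-deviation denominator the harness uses, so nearest-rank percentiles and the sample (n − 1) standard deviation are assumptions here, and all three function names are introduced for illustration.

```rust
/// Nearest-rank percentile of an ascending-sorted, non-empty slice.
/// `p` is a whole percentage in 1..=100.
pub fn percentile(sorted: &[f64], p: usize) -> f64 {
    // ceil(p * n / 100) in integer arithmetic, to avoid float rounding.
    let rank = (p * sorted.len() + 99) / 100;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Sample standard deviation (n - 1 denominator); needs at least two samples.
pub fn std_dev(samples: &[f64]) -> f64 {
    let n = samples.len() as f64;
    let mean = samples.iter().sum::<f64>() / n;
    let var = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    var.sqrt()
}

/// True when the spread is below 10 % of the median, the criterion above.
pub fn is_stable(samples: &[f64]) -> bool {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    std_dev(samples) < 0.10 * percentile(&sorted, 50)
}
```

With 30 rounds, nearest-rank p95 is the 29th sorted sample, so a single outlier round does not move it, while two or more do. That is one reason a contentious cell is better settled with the 100-round interleaved re-bench than by rereading the 30-round numbers.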

---

## Reproducing this benchmark

Every number in this report comes from one of four raw logs curated into [`raw/`](../raw/) in this repo (verbatim from PowerShell capture on the test machine, never edited):

- [`raw/2026-04-v0.5.66_cross-tool-vs-everything.txt`](../raw/2026-04-v0.5.66_cross-tool-vs-everything.txt) — cross-tool HOT comparison vs Everything (12/12 table, §1).
- [`raw/2026-04-v0.5.66_full-benchmark-suite.txt`](../raw/2026-04-v0.5.66_full-benchmark-suite.txt) — forensic 100-round re-bench, drive-scale sweep, `--sort path` pin, full-scan export, targeted-query latency, direct-redirect stdout.
- [`raw/2026-04-v0.5.66_cold-parity-per-drive.txt`](../raw/2026-04-v0.5.66_cold-parity-per-drive.txt) — per-drive Rust-vs-C++ parity (§2).
- [`raw/2026-04-v0.5.62_aggregate-baseline.txt`](../raw/2026-04-v0.5.62_aggregate-baseline.txt) — pre-v0.5.66 baseline: daemon memory, startup, aggregations, CLI/API/MCP validation suites (100% pass on 729 tests across all three surfaces).

See [`raw/README.md`](../raw/README.md) for the capture-and-preserve policy.

### Rerun the cross-tool comparison yourself

```powershell
# Elevated PowerShell, repository root, after cargo build --release
$env:UFFS_BENCH_DRIVES = "C,D"
rust-script .\scripts\windows\cross-tool-benchmark.rs `
    --rounds 30 --tools uffs_rust,uffs_cpp,es --sinks file
```

### Rerun the Rust-vs-C++ parity

```powershell
# Cold per-drive (matches the §2 first table, 26 M records):
.\scripts\windows\cold-parity-per-drive.ps1 `
    -Drives C,D,E,F,M,S -PurgeCacheFirst `
    -OutputFile LOG\my_parity_run.txt

# Daemon-HOT (matches the §2 second table):
.\scripts\windows\cold-parity-per-drive.ps1 `
    -Drives C,D,E,F,M,S `
    -OutputFile LOG\my_hot_parity_run.txt
```

Both runs emit pre-formatted markdown tables at the end that paste directly into this document's `§Head-to-head 2` tables.

---

## What this report does not claim

- **"Fastest file search on Windows."** That's a claim the methodology can't support: different workloads have different winners, and we don't test every workload (e.g. WizFile's no-daemon direct-MFT UI path probably beats UFFS on single-drive cold-open interactive, but we haven't rigorously measured that). What we *do* claim is specific, scoped, and numbered: 12/12 vs Everything on the documented patterns, 2.6× vs C++ on cold, 16.0× vs C++ on daemon-HOT steady state, 0–3 ms daemon-side at 26 M records.
- **"Best tool for every user."** Everything is an excellent product for desktop-interactive single-drive use on a laptop and will continue to be. UFFS is a better choice when any of "huge", "scripted", "structured", "aggregated", or "AI-agent-accessible" is on your list of requirements.
- **"Enterprise-ready today."** The engine is. The packaging, signing, update channel, support model, and corporate-deployment story are still maturing — covered separately in [`docs/dev/architecture/UFFS_licensing_commercialization_strategy_2026-04-08.md`](../../dev/architecture/UFFS_licensing_commercialization_strategy_2026-04-08.md).

---

## References

### Product

- **README:** [`README.md`](../../../README.md)
- **User manual / performance:** [`docs/user-manual/performance.md`](../../user-manual/performance.md)
- **Engineering performance reference:** [`docs/architecture/engine/09-performance.md`](../../architecture/engine/09-performance.md)
- **Performance deep dive:** [`docs/architecture/engine/11-performance-deep-dive.md`](../../architecture/engine/11-performance-deep-dive.md)
- **Benchmark methodology (public):** [`docs/benchmarks/methodology.md`](../methodology.md) — fairness-doctrine rules, OS-page-cache handling, 30-round discipline, re-bench policy, publishing principles. The single-link reply to *"this benchmark is rigged because..."*.
- **Benchmark methodology (internal engineering detail):** [`docs/research/cross-tool-benchmark-analysis.md`](../../research/cross-tool-benchmark-analysis.md) — forensic per-cell analysis; kept internal.

### Marketing strategy (internal, informs this document)

- [`docs/dev/architecture/ntfs_mcp_marketing_strategy_deep_dive.md`](../../dev/architecture/ntfs_mcp_marketing_strategy_deep_dive.md)
- [`docs/dev/architecture/marketing_strategy_adjustment_after_benchmark_update.md`](../../dev/architecture/marketing_strategy_adjustment_after_benchmark_update.md)

### Competitor references

- Everything (voidtools): https://www.voidtools.com — Everything FAQ: https://www.voidtools.com/faq/ — `es.exe` 1.1.0.30 command-line docs.
- WizFile: https://antibody-software.com/wizfile/ — not benchmarked in this report (no `--out` flag for automated comparison).
- Windows Search / Microsoft Search: https://learn.microsoft.com/en-us/microsoftsearch/overview-microsoft-search — optimised for indexed properties + ecosystem integration, different problem class.

---

*Report compiled 2026-04-21 from raw logs in `LOG/`. Next scheduled re-run: after Phase 5 bounded-heap top-N fix lands (expected to close the 1 112 ms `*` `--limit 100` regression).*