# Triage Report — tfactory-demo / 001-greeting-generator

> **Mode:** initial
> **Generated at:** 2026-05-29T10:33:18Z
> **Pipeline:** Planner ✅ → Gen-Functional (Browser-lane manual seed; see note) → Executor (`tfactory-runner-playwright:latest`) ✅ → Evaluator (manual scoring; see note) → Triager (this report)

## Summary

| Metric | Value |
|---|---|
| Subtasks planned | 5 |
| Tests generated | 5 |
| Tests executed | 5 |
| **Accepted (passing)** | **4** ✅ |
| **Rejected (failing)** | **1** ❌ — AC#5 (seeded cache bug) |
| Coverage strategy | `null` (Browser lane per Decision 11) |

## Committed (accept)

- **`generate-produces-non-empty-text`** — `tests/e2e/generate-produces-non-empty-text.spec.ts`
  - signals: stability=stable (1/1 run), coverage=N/A (browser lane), semantic=high
  - intent: CREATE new tests/e2e/generate-produces-non-empty-text.spec.ts
  - evidence: 📸 [screenshot](evidence/generate-produces-non-empty-text/test-finished-1.png)
- **`greeting-category-vocabulary`** — `tests/e2e/greeting-category-vocabulary.spec.ts`
  - signals: stability=stable, coverage=N/A, semantic=high
  - intent: CREATE new tests/e2e/greeting-category-vocabulary.spec.ts
  - evidence: 📸 [screenshot](evidence/greeting-category-vocabulary/test-finished-1.png)
- **`snarky-tone-vocabulary`** — `tests/e2e/snarky-tone-vocabulary.spec.ts`
  - signals: stability=stable, coverage=N/A, semantic=high
  - intent: CREATE new tests/e2e/snarky-tone-vocabulary.spec.ts
  - evidence: 📸 [screenshot](evidence/snarky-tone-vocabulary/test-finished-1.png)
- **`clear-empties-output`** — `tests/e2e/clear-empties-output.spec.ts`
  - signals: stability=stable, coverage=N/A, semantic=high
  - intent: CREATE new tests/e2e/clear-empties-output.spec.ts
  - evidence: 📸 [screenshot](evidence/clear-empties-output/test-finished-1.png)

## Rejected (reject — surfaced for human review)

- **`different-text-on-consecutive-generates`** — `tests/e2e/different-text-on-consecutive-generates.spec.ts`
  - **VERDICT: REJECT** — test ran cleanly and
    correctly identified a real bug in the SUT
  - signals: stability=stable (deterministic failure), coverage=N/A, semantic=high (test logic is sound; the SUT has a defect)
  - reason: AC#5 expected two consecutive Generate clicks to produce *different* text. The SUT's `src/generate.ts` caches its first result per `(category, tone)` key in a module-level `Map`, so the second click returns the cached value. The test correctly detected this.
  - evidence: 📸 [screenshot](evidence/different-text-on-consecutive-generates/test-failed-1.png) · 🎥 [video.webm](evidence/different-text-on-consecutive-generates/video.webm) · 🔍 [trace.zip](evidence/different-text-on-consecutive-generates/trace.zip)
  - **operator action required:** fix the `src/generate.ts` cache bug, then re-run; the test will then accept.

## What this demonstrates about TFactory v0.2.0

1. ✅ **Polyglot Planner** — read `spec.md` and `.tfactory.yml`, understood that the SUT was TS+Playwright+Browser-lane, and emitted 5 subtasks with the correct `(language, framework, lane, target_name)` quadruples per AC.
2. ✅ **Per-AC target identification** — Planner correctly mapped AC#5 to `src/generate.ts::generate` (the seeded bug location) and AC#1–4 to `src/App.tsx::App` (UI surface).
3. ✅ **Framework Docker runner** — `tfactory-runner-playwright:latest` ran the tests with Playwright 1.49 + Chromium against the live Pages URL.
4. ✅ **Evidence capture** — every test produced a screenshot; the failing AC#5 case additionally produced video.webm + trace.zip for human inspection (per Decision 12 in the design doc).
5. ✅ **Evidence-link rendering** — this report's accept/reject rows surface portal-served URLs per the commit `5d8f588` follow-up.

## Honest caveats

- **Gen-Functional was NOT used to author the .spec.ts files.** The agent's MVP filter currently processes `Lane.UNIT` only; the Planner correctly emitted `Lane.BROWSER` subtasks, but Gen-Functional declined them with `"no pending Lane.UNIT subtasks to generate"`.
  Browser-lane Gen-Functional is a Phase-2 ramp item.
- **For the demo, the 5 .spec.ts files were hand-written** matching the Planner's plan (target file paths, rationale, AC mapping). The Planner provided the blueprint; a human filled in the bodies. This is a fair representation of how v0.2.0 currently works for Browser-lane: human-templated bodies, agent-planned structure.
- **Evaluator was NOT invoked.** Verdicts here are direct readouts of Playwright's pass/fail status. The Evaluator's 5-signal verdict pipeline (coverage delta · 3× stability · mutate-and-check · flake-lint promotion · LLM semantic relevance) ramps to Browser-lane in the same Phase-2 effort that lights Gen-Functional Browser-lane.
- **Triager was NOT invoked.** This report is hand-authored to follow the schema the live Triager would produce, including the evidence-link bullets from commit `5d8f588`.

## Reproduce

```bash
# Live SUT: https://olafkfreund.github.io/tfactory-demo/
# Source:   https://github.com/olafkfreund/tfactory-demo

docker run --rm \
  --network=bridge \
  -v /path/to/tfactory-demo:/repo:ro \
  -v /path/to/scratch:/scratch \
  -e TFACTORY_TARGET_URL=https://olafkfreund.github.io/tfactory-demo/ \
  -e NODE_PATH=/usr/lib/node_modules \
  tfactory-runner-playwright:latest \
  sh -c "cd /tmp && cp -r /repo/playwright.config.ts /repo/tests . && NODE_PATH=/usr/lib/node_modules npx playwright test"

# Expected: 4 passed + 1 failed (AC#5 — the seeded cache bug).
```
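For operators picking up the AC#5 action item, the defect described in the Rejected section (a module-level `Map` caching the first result per `(category, tone)` key) can be illustrated with a minimal TypeScript sketch. This is not the actual contents of `src/generate.ts`: the phrase list, the function names `generateCached` and `generateFresh`, and the injectable `rng` parameter are all hypothetical, chosen only to show the shape of the bug and one possible fix.

```typescript
// Hypothetical sketch only; the real src/generate.ts may look different.

// Random source returning a value in [0, 1), same contract as Math.random.
type Rng = () => number;

// Assumed phrase pool. A real SUT would presumably choose vocabulary
// by category and tone; one shared pool keeps the sketch short.
const PHRASES: readonly string[] = [
  "Hello there",
  "Oh. You again",
  "Greetings, I suppose",
];

// --- Shape of the seeded bug -------------------------------------------
// Module-level cache keyed by (category, tone): the first result is stored
// and every later call with the same key returns it unchanged.
const cache = new Map<string, string>();

function generateCached(category: string, tone: string, rng: Rng = Math.random): string {
  const key = `${category}:${tone}`;
  const hit = cache.get(key);
  if (hit !== undefined) {
    return hit; // second Generate click lands here, so the text never changes
  }
  const text = PHRASES[Math.floor(rng() * PHRASES.length)];
  cache.set(key, text);
  return text;
}

// --- One possible fix --------------------------------------------------
// Remember only the previous result per key and exclude it from the next
// draw, so two consecutive calls cannot return the same text.
const lastByKey = new Map<string, string>();

function generateFresh(category: string, tone: string, rng: Rng = Math.random): string {
  const key = `${category}:${tone}`;
  const previous = lastByKey.get(key);
  const candidates = PHRASES.filter((phrase) => phrase !== previous);
  const text = candidates[Math.floor(rng() * candidates.length)];
  lastByKey.set(key, text);
  return text;
}

// Usage: with the cached variant both calls yield the same string;
// with the fresh variant the second call always differs from the first.
const first = generateCached("greeting", "snarky");
const second = generateCached("greeting", "snarky");
console.log(first === second);

const third = generateFresh("greeting", "snarky");
const fourth = generateFresh("greeting", "snarky");
console.log(third === fourth);
```

Simply deleting the cache would not be enough to satisfy AC#5 as stated, since an unconstrained random draw can still repeat the previous phrase; excluding the last result is one way to make "different text on consecutive generates" hold deterministically. Whether that matches the intended product behavior is a decision for the SUT's owners.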