# Triage Report — tfactory-demo / 001-greeting-generator

> **Mode:** initial
> **Generated at:** 2026-05-29T10:33:18Z
> **Pipeline:** Planner ✅ → Gen-Functional (Browser-lane manual seed; see note) → Executor (`tfactory-runner-playwright:latest`) ✅ → Evaluator (manual scoring; see note) → Triager (this report)

## Summary

| Metric | Value |
|---|---|
| Subtasks planned | 5 |
| Tests generated | 5 |
| Tests executed | 5 |
| **Accepted (passing)** | **4** ✅ |
| **Rejected (failing)** | **1** ❌ — AC#5 (seeded cache bug) |
| Coverage strategy | `null` (Browser lane per Decision 11) |

## Committed (accept)

- **`generate-produces-non-empty-text`** — `tests/e2e/generate-produces-non-empty-text.spec.ts`
  - signals: stability=stable (1/1 run), coverage=N/A (browser lane), semantic=high
  - intent: CREATE new tests/e2e/generate-produces-non-empty-text.spec.ts
  - evidence: 📸 [screenshot](evidence/generate-produces-non-empty-text/test-finished-1.png)

- **`greeting-category-vocabulary`** — `tests/e2e/greeting-category-vocabulary.spec.ts`
  - signals: stability=stable, coverage=N/A, semantic=high
  - intent: CREATE new tests/e2e/greeting-category-vocabulary.spec.ts
  - evidence: 📸 [screenshot](evidence/greeting-category-vocabulary/test-finished-1.png)

- **`snarky-tone-vocabulary`** — `tests/e2e/snarky-tone-vocabulary.spec.ts`
  - signals: stability=stable, coverage=N/A, semantic=high
  - intent: CREATE new tests/e2e/snarky-tone-vocabulary.spec.ts
  - evidence: 📸 [screenshot](evidence/snarky-tone-vocabulary/test-finished-1.png)

- **`clear-empties-output`** — `tests/e2e/clear-empties-output.spec.ts`
  - signals: stability=stable, coverage=N/A, semantic=high
  - intent: CREATE new tests/e2e/clear-empties-output.spec.ts
  - evidence: 📸 [screenshot](evidence/clear-empties-output/test-finished-1.png)

## Rejected (reject — surfaced for human review)

- **`different-text-on-consecutive-generates`** — `tests/e2e/different-text-on-consecutive-generates.spec.ts`
  - **VERDICT: REJECT** — test ran cleanly and correctly identified a real bug in the SUT
  - signals: stability=stable (deterministic failure), coverage=N/A, semantic=high (test logic is sound; the SUT has a defect)
  - reason: AC#5 expects two consecutive Generate clicks to produce *different* text. The SUT's `src/generate.ts` caches its first result per `(category, tone)` key in a module-level `Map`, so the second click returns the cached value and the output is identical. The test correctly detected this.
  - evidence: 📸 [screenshot](evidence/different-text-on-consecutive-generates/test-failed-1.png) · 🎥 [video.webm](evidence/different-text-on-consecutive-generates/video.webm) · 🔍 [trace.zip](evidence/different-text-on-consecutive-generates/trace.zip)
  - **operator action required:** fix the cache bug in `src/generate.ts`, then re-run; the test should then pass and be accepted.
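
The defect described in the reason line can be illustrated with a minimal sketch. This is a hypothetical reconstruction of the pattern, not the actual contents of `src/generate.ts`: the phrase pool, the `generateBuggy` and `generateFixed` names, and the injectable `rng` parameter are all assumptions introduced for illustration.

```typescript
// Hypothetical sketch of the defect pattern; NOT the real src/generate.ts.
type Rng = () => number; // assumed to return a value in [0, 1), like Math.random

// Assumed phrase pool; the real vocabulary is not shown in this report.
const PHRASES: string[] = ["Hello there!", "Good to see you!", "Welcome back!"];

// Module-level cache: it outlives every click, so entries are never refreshed.
const cache = new Map<string, string>();

function pickPhrase(category: string, tone: string, rng: Rng): string {
  const index = Math.floor(rng() * PHRASES.length);
  // The fallback guards against an rng that returns an out-of-range value.
  const phrase = PHRASES[index] ?? "Hello there!";
  return `[${category}/${tone}] ${phrase}`;
}

// Buggy variant: the first result per (category, tone) key is stored and replayed.
function generateBuggy(category: string, tone: string, rng: Rng = Math.random): string {
  const key = `${category}:${tone}`;
  const cached = cache.get(key);
  if (cached !== undefined) {
    return cached; // the second Generate click returns here with identical text
  }
  const text = pickPhrase(category, tone, rng);
  cache.set(key, text);
  return text;
}

// One possible fix: drop the cache so every call draws a fresh phrase.
function generateFixed(category: string, tone: string, rng: Rng = Math.random): string {
  return pickPhrase(category, tone, rng);
}
```

With a random source, two fresh draws can still coincide by chance, so whether a fix must also avoid immediate repeats depends on how AC#5 is worded in the spec, which this report does not quote.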

## What this demonstrates about TFactory v0.2.0

1. ✅ **Polyglot Planner** — read `spec.md` and `.tfactory.yml`, recognized the SUT as TypeScript + Playwright on the Browser lane, and emitted 5 subtasks, each carrying the correct `(language, framework, lane, target_name)` quadruple for its AC.
2. ✅ **Per-AC target identification** — the Planner correctly mapped AC#5 to `src/generate.ts::generate` (the seeded bug location) and AC#1–4 to `src/App.tsx::App` (the UI surface).
3. ✅ **Framework Docker runner** — `tfactory-runner-playwright:latest` ran the tests with Playwright 1.49 + Chromium against the live Pages URL.
4. ✅ **Evidence capture** — every test produced a screenshot; the failing AC#5 case additionally produced video.webm + trace.zip for human inspection (per Decision 12 in the design doc).
5. ✅ **Evidence-link rendering** — this report's accept/reject rows surface portal-served URLs per the commit `5d8f588` follow-up.
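
To make the `(language, framework, lane, target_name)` quadruple concrete, here is a rough sketch of what the five planned subtasks could look like as data. The `Subtask` interface, its field names, and the lowercase string values are assumptions for illustration; only the target names and the AC-to-target mapping come from this report.

```typescript
// Hypothetical shape of a Planner subtask; field names are assumptions.
type Lane = "unit" | "browser";

interface Subtask {
  ac: number; // acceptance-criterion number
  language: string;
  framework: string;
  lane: Lane;
  targetName: string; // "<file>::<symbol>", as quoted in this report
}

// Mapping stated in this report: AC#5 targets the generator, AC#1 to AC#4 the UI.
function targetFor(ac: number): string {
  return ac === 5 ? "src/generate.ts::generate" : "src/App.tsx::App";
}

const plan: Subtask[] = [1, 2, 3, 4, 5].map(
  (ac): Subtask => ({
    ac,
    language: "typescript", // assumed encoding of "TS"
    framework: "playwright",
    lane: "browser",
    targetName: targetFor(ac),
  }),
);

// Print one line per subtask, e.g. for a quick review of the plan.
console.log(plan.map((s) => `AC#${s.ac} -> ${s.targetName}`).join("\n"));
```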

## Honest caveats

- **Gen-Functional was NOT used to author the `.spec.ts` files.** The agent's MVP filter currently processes `Lane.UNIT` only; the Planner correctly emitted `Lane.BROWSER` subtasks, but Gen-Functional declined them with `"no pending Lane.UNIT subtasks to generate"`. Browser-lane Gen-Functional is a Phase-2 ramp item.
- **For the demo, the 5 `.spec.ts` files were hand-written** to match the Planner's plan (target file paths, rationale, AC mapping). The Planner provided the blueprint; a human filled in the bodies. This fairly represents how v0.2.0 currently handles the Browser lane: agent-planned structure, human-written bodies.
- **Evaluator was NOT invoked.** Verdicts here are direct readouts of Playwright's pass/fail status, and the `stability` and `semantic` signals listed above were scored by hand. The Evaluator's 5-signal verdict pipeline (coverage delta · 3× stability · mutate-and-check · flake-lint promotion · LLM semantic relevance) is slated to reach the Browser lane in the same Phase-2 effort that enables Browser-lane Gen-Functional.
- **Triager was NOT invoked.** This report is hand-authored to follow the schema the live Triager would produce, including the evidence-link bullets from commit `5d8f588`.
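
The first caveat is easier to follow with a small sketch of a lane filter that behaves as described. This is not the agent's code, and the report does not say what language the agent is written in; `PendingSubtask`, `GenResult`, and `genFunctional` are hypothetical names, and TypeScript is used only to match the SUT. The decline message is the one quoted above.

```typescript
// Hypothetical sketch of an MVP lane filter; names and shapes are assumptions.
type Lane = "UNIT" | "BROWSER";

interface PendingSubtask {
  id: string;
  lane: Lane;
}

type GenResult =
  | { status: "generated"; ids: string[] }
  | { status: "declined"; message: string };

// Only UNIT-lane subtasks pass the filter; anything else is left untouched.
function genFunctional(pending: PendingSubtask[]): GenResult {
  const unit = pending.filter((s) => s.lane === "UNIT");
  if (unit.length === 0) {
    return { status: "declined", message: "no pending Lane.UNIT subtasks to generate" };
  }
  return { status: "generated", ids: unit.map((s) => s.id) };
}

// The demo plan: five Browser-lane subtasks, so nothing passes the filter.
const demoPlan: PendingSubtask[] = [
  "generate-produces-non-empty-text",
  "greeting-category-vocabulary",
  "snarky-tone-vocabulary",
  "clear-empties-output",
  "different-text-on-consecutive-generates",
].map((id): PendingSubtask => ({ id, lane: "BROWSER" }));

const result = genFunctional(demoPlan);
console.log(result.status);
```

Under these assumptions the filter declines the whole demo plan, which is why the test bodies had to be written by hand.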

## Reproduce

```bash
# Live SUT: https://olafkfreund.github.io/tfactory-demo/
# Source:   https://github.com/olafkfreund/tfactory-demo
docker run --rm \
  --network=bridge \
  -v /path/to/tfactory-demo:/repo:ro \
  -v /path/to/scratch:/scratch \
  -e TFACTORY_TARGET_URL=https://olafkfreund.github.io/tfactory-demo/ \
  -e NODE_PATH=/usr/lib/node_modules \
  tfactory-runner-playwright:latest \
  sh -c "cd /tmp && cp -r /repo/playwright.config.ts /repo/tests . && NODE_PATH=/usr/lib/node_modules npx playwright test"

# Expected: 4 passed + 1 failed (AC#5 — the seeded cache bug).
```