---
name: mb-harness
description: >
  Set up deterministic commands, worktrees, and quality gates so agents can run safely in this repository.
---

# mb-harness — Harness engineering setup

- **What it does:** defines the execution harness around the repo, including commands, gates, and parallel-safe workflow.
- **Use it when:** the repository needs stronger agent guardrails before autonomous or multi-session work.
- **Input:** repository root and the project’s canonical build, test, and lint commands.
- **Output:** documented quality gates, optional Codex config, and a safer harness for agent execution.

## Goal
Turn the repo into a reliable “harness” for agents:
- clear entry points (AGENTS.md)
- reproducible commands (build/test/lint)
- mechanical checks (CI + MB lint)
- parallel-safe workflow (worktrees)

## Process

### 1) Codex project configuration (optional but recommended)
If you use Codex:
1. Create a `.codex/` folder in the repository root.
2. Create `.codex/config.toml` from `assets/codex-config.toml`.
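
The two steps above could be scripted roughly as follows. This is a minimal sketch, not part of the skill itself: the function name `codex_config_init` is hypothetical, and it assumes you pass the path to this skill's `assets/codex-config.toml`. It refuses to overwrite an existing config so local edits are not lost.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Copy the Codex config template into the repo without clobbering an existing file.
# $1 = path to the template (assumed to be this skill's assets/codex-config.toml).
codex_config_init() {
  local template="$1"
  if [ ! -f "$template" ]; then
    echo "template not found: $template" >&2
    return 1
  fi
  mkdir -p .codex
  if [ -e .codex/config.toml ]; then
    # Keep whatever is already there; the user may have customized it.
    echo "kept existing .codex/config.toml"
  else
    cp "$template" .codex/config.toml
    echo "created .codex/config.toml"
  fi
}
```

Run it from the repository root, passing the template location for your installation of the skill.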

Usage examples:
- default profile (coding): `codex`
- deep review: `codex --profile deep-review`

### 2) Document quality gates
In `AGENTS.md`, list the canonical commands and keep the file short. Typical gates:
- install deps
- lint / typecheck
- unit tests
- e2e tests

If the repo has UI or browser flows, explicitly document:
- Playwright command(s)
- agent-browser / browser MCP path (if available)
- where screenshots/videos/traces are stored
- which flows are considered release-critical

If the repo lacks any of these commands, add minimal scripts or Make targets for them.
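
As one possible shape for such a script, the sketch below runs each gate in order and stops at the first failure. The file name `scripts/gates.sh` and the `*_CMD` variables are assumptions for illustration; every default is the no-op `true` and should be replaced with the repo's real command.

```bash
#!/usr/bin/env bash
# Hypothetical scripts/gates.sh: run every quality gate in order, stop at the first failure.
set -euo pipefail

# Placeholder commands (assumptions); override via environment or edit in place.
INSTALL_CMD="${INSTALL_CMD:-true}"
LINT_CMD="${LINT_CMD:-true}"
UNIT_CMD="${UNIT_CMD:-true}"
E2E_CMD="${E2E_CMD:-true}"

run_gate() {
  # $1 = gate name, $2 = command string to execute
  echo "==> $1: $2"
  bash -c "$2"
}

run_gate install "$INSTALL_CMD"
run_gate lint    "$LINT_CMD"
run_gate unit    "$UNIT_CMD"
run_gate e2e     "$E2E_CMD"
echo "all gates passed"
```

A single entry point like this gives agents and CI the same command to run, so `AGENTS.md` can point to one line instead of four.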

### 3) Worktree workflow (parallel agents)
If multiple agents work in parallel:
- create worktrees per agent to avoid file conflicts
- merge only after passing gates

Example:
```bash
git worktree add ../wt-agent-1 -b agent-1
```
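
A slightly fuller sketch of the same workflow, covering the "merge only after passing gates" rule. The helper names and `GATE_CMD` are hypothetical; `GATE_CMD` stands in for the repo's canonical gate command and defaults to the no-op `true`. Both helpers assume they are run from the main checkout.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Create an isolated worktree and branch for one agent (naming is illustrative).
agent_worktree_add() {
  local agent="$1"
  git worktree add "../wt-${agent}" -b "${agent}"
}

# Merge an agent branch only if the gate command succeeds inside its worktree.
# GATE_CMD is a placeholder for the repo's canonical gate command.
agent_merge_if_green() {
  local agent="$1"
  ( cd "../wt-${agent}" && bash -c "${GATE_CMD:-true}" ) || {
    echo "gates failed for ${agent}; not merging" >&2
    return 1
  }
  git merge --no-ff --no-edit "${agent}"
  # The worktree is no longer needed once its branch is merged.
  git worktree remove "../wt-${agent}"
}
```

For example, `agent_worktree_add agent-1` before the agent starts, then `GATE_CMD="make test" agent_merge_if_green agent-1` when it finishes (`make test` is only an example gate).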

### 4) Add deterministic Memory Bank lint
If not already present, run `mb-garden` to add `scripts/mb-lint.mjs` and its CI workflow.
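
Once the lint script exists, it can be wired in as one more gate. The sketch below assumes the script runs under Node with no arguments and exits non-zero on failure; neither detail is confirmed here, so check the script that `mb-garden` generates. The function name `mb_lint_gate` is hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Run the Memory Bank lint as a gate; fail clearly if it has not been installed yet.
mb_lint_gate() {
  if [ ! -f scripts/mb-lint.mjs ]; then
    echo "scripts/mb-lint.mjs is missing; run mb-garden first" >&2
    return 1
  fi
  # Assumption: the script takes no arguments and signals failure via its exit code.
  node scripts/mb-lint.mjs
}
```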

### 4.1) Browser verification for UI projects
If the product has a UI:
- prefer Playwright / agent-browser / CDP-driven checks over a manual “looks OK” check
- persist artifacts (screenshots, videos, traces) into `.tasks/TASK-XXX/`
- document canonical browser verification commands in `.memory-bank/testing/index.md`
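
A minimal sketch of persisting artifacts per task, assuming Playwright Test is the runner. The function name, the `playwright` subfolder, and the `PLAYWRIGHT_CMD` override are illustrative assumptions; `--trace on` and `--output` are Playwright Test CLI options.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Run the browser checks for one task and keep artifacts under .tasks/<task-id>/.
# $1 = task id, for example the TASK-XXX folder name used by the repo.
browser_verify() {
  local task_id="$1"
  local out_dir=".tasks/${task_id}/playwright"
  mkdir -p "$out_dir"
  # --trace on records a trace for every test; --output sets the artifact directory.
  # PLAYWRIGHT_CMD is intentionally unquoted so "npx playwright" splits into two words.
  ${PLAYWRIGHT_CMD:-npx playwright} test --trace on --output "$out_dir"
}
```

Whatever command the repo settles on, record it in `.memory-bank/testing/index.md` so every agent runs the same check.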

### 5) Optional: skill eval harness
If you iterate on skills heavily:
- pair `codex exec --json` runs with deterministic graders (see the OpenAI evals guidance)
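
One way a deterministic grader might look: it inspects the repository state after a run and returns the same verdict for the same state. The checks below loosely mirror this skill's definition of done and are illustrative assumptions, not a fixed rubric; the prompt in the commented-out example is also a placeholder.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Deterministic grader: same repo state in, same verdict out.
grade_harness() {
  local failures=0
  check() {
    # $1 = label, remaining args = command that must succeed
    local label="$1"; shift
    if "$@" >/dev/null 2>&1; then
      echo "PASS $label"
    else
      echo "FAIL $label"
      failures=$((failures + 1))
    fi
  }
  check "AGENTS.md exists"            test -f AGENTS.md
  check "AGENTS.md mentions tests"    grep -qi "test" AGENTS.md
  check "Memory Bank lint is present" test -f scripts/mb-lint.mjs
  # Exit status is 0 only when every check passed.
  [ "$failures" -eq 0 ]
}

# Example run (commented out because it calls the Codex CLI):
# codex exec --json "Use mb-harness to set up this repo" > run.jsonl
# grade_harness
```

Because the grader only reads files, it can be rerun on a saved worktree without invoking the model again.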

## Definition of done
- `.codex/config.toml` exists (if using Codex) with coding and review profiles.
- `AGENTS.md` lists quality-gate commands.
- repo has a documented worktree workflow for parallel agents.
- Memory Bank lint exists and passes.
- UI repos have a documented browser-driven verification path.