---
name: builder-smoke-test
description: Smoke test the Agent Builder feature branch end-to-end against a hermetic project scaffolded by the skill (linked to the current worktree). Covers workspace reconciliation, stored agents/skills CRUD, ownership, visibility, stars, registry/library Copy flow, picker allowlists, model policy, RBAC role gating, role impersonation UI, builder defaults, infrastructure diagnostics, channels, and Studio + Agent Builder UI. Trigger when validating the agent-builder feature branch, PRs that touch packages/server, packages/playground, packages/playground-ui agent-builder routes, or builder EE code paths.
---
# Builder Smoke Test
End-to-end smoke testing of the Agent Builder feature set against a hermetic project the skill scaffolds at `~/mastra-builder-smoke-tests/builder-smoke` (configurable). The project links to the current worktree via `pnpm` `link:` overrides, so changes to packages under `packages/`, `stores/`, `auth/`, `channels/`, `observability/`, `browser/`, and `client-sdks/` take effect on the next `mastra dev` restart.
This skill is for **branch QA** — it complements the release-time `mastra-smoke-test`. It exercises the Builder EE surface (stored entities, RBAC, registry, infra, channels) using a minimal, predictable project rather than the kitchen-sink `examples/agent`.
## ⚠️ Mandatory Test Checklist
**Use `task_write` to track progress.** Run ALL sections unless `--test` or `--scope` narrows the run.
**Do not skip sections unless you hit an actual blocker.** "Seemed complex" or "I'll come back to it" are not valid reasons. Attempt every step — only stop when you literally cannot proceed. Report what you tried and what blocked you.
| # | Section | Reference | When required |
| --- | ---------------------- | -------------------------------- | ----------------------------------------------------------------------- |
| 1 | **Setup** | `references/setup.md` | Always |
| 2 | **Workspace** | `references/workspace.md` | `--test workspace` or full |
| 3 | **Reconciliation** | `references/reconciliation.md` | Steps 1 + 5 only; steps 2/3/4/6 are out of smoke-test scope (see below) |
| 4 | **Defaults** | `references/defaults.md` | `--test defaults` or full |
| 5 | **Model Policy** | `references/model-policy.md` | `--test model-policy` or full |
| 6 | **Skills** | `references/skills.md` | `--test skills` or full |
| 7 | **Registry** | `references/registry.md` | `--test registry` or full |
| 8 | **Agents** | `references/agents.md` | `--test agents` or full |
| 9 | **Picker Allowlists** | `references/picker-allowlist.md` | `--test pickers` or full |
| 10 | **Favorites** | `references/favorites.md` | `--test favorites` or full (formerly `stars`) |
| 11 | **Permissions / RBAC** | `references/permissions.md` | `--test permissions` or full |
| 12 | **Infrastructure** | `references/infrastructure.md` | `--test infrastructure` or full |
| 13 | **Channels** | `references/channels.md` | `--test channels` or full |
| 14 | **UI** | `references/ui.md` | `--test ui` or full |
| 15 | **Auth** | `references/auth.md` | `--test auth` or `--auth on` |
### Execution flow
1. **Confirm the project directory.** Before scaffolding, ask the user where they want `$PROJECT_DIR` to live. Offer the default (`~/mastra-builder-smoke-tests/builder-smoke`) as a suggestion. Skip the question if they already passed `--dir` or have `$BUILDER_SMOKE_TEST_DIR` exported. See `references/setup.md` step 0.
2. **Read the reference file** for each section you're about to run.
3. **Under `--auth on`, extract the session cookie before running any other section.** The WorkOS cookie is `httpOnly`, so `curl` cannot mint it and `document.cookie` cannot read it. The scaffold ships a debug route at `GET /smoke-test/cookie` gated by `SMOKE_TEST_COOKIE_LEAK=1`. Follow the **"Extracting the session cookie for curl (auth on)"** section below before touching any auth-on endpoint. **Do not pivot to UI-only testing because curl is "blocked" — the cookie route is the unblock path.**
4. **Seed non-owner data after the server has booted at least once.** A fresh scaffold has no skills authored by anyone other than the test user, which makes non-owner / Library Copy / non-owner visibility / non-admin stars flows untestable. Run `bash .claude/skills/builder-smoke-test/scripts/seed-multi-user.sh` (or with `--dir $PROJECT_DIR`) before sections 6 (Skills), 7 (Registry), and 10 (Stars). The script is idempotent and bypasses RBAC by writing directly to libsql, so it works regardless of `--auth` mode or current role. **Do not mark non-owner steps as "blocked" without running this first.**
5. **Execute the steps** — use `curl` for API checks (with `-H "Cookie: $COOKIE"` under `--auth on`), whichever browser tool the harness has wired up (Stagehand, Chrome MCP, etc.) for UI checks.
6. **Record results** in the summary table.
7. **Mark the section complete** with `task_write` before moving to the next.
### Partial testing (`--test`)
If `--test` is provided:
1. Always run **Setup**.
2. Run only the specified section(s).
3. Skip everything else.
Example: `--test skills,registry,agents` → Setup + Skills + Registry + Agents.
### Scope shortcuts (`--scope`)
`--scope` runs a curated group of related sections. Setup is always implied.
| Scope | Includes |
| -------- | ------------------------------------------------------------- |
| `rbac` | permissions, auth |
| `skills` | skills, registry, defaults |
| `agents` | agents, pickers, defaults, model-policy |
| `infra` | infrastructure, channels, reconciliation |
| `ui` | ui |
| `quick` | workspace, skills, agents, favorites, ui (skips long-running) |
`--scope` and `--test` can be combined; the union is run.
## Usage
```bash
# Full smoke (interactive)
/builder-smoke-test
# Specific sections
/builder-smoke-test --test workspace,skills
/builder-smoke-test --test agents,favorites
/builder-smoke-test --test reconciliation
/builder-smoke-test --test ui
# Scope shortcuts
/builder-smoke-test --scope rbac
/builder-smoke-test --scope skills
/builder-smoke-test --scope quick
# Force auth on / off (otherwise auto-detected from WORKOS_* env vars)
/builder-smoke-test --auth on
/builder-smoke-test --auth off
# Run auth-on as a non-admin role (must match the logged-in user's actual role)
/builder-smoke-test --auth on --role viewer
/builder-smoke-test --auth on --role member
# Skip the browser pass (API-only run)
/builder-smoke-test --skip-browser
```
## Parameters
| Parameter | Description | Default |
| ------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------- |
| `--test` | Comma-separated section names (see table above). | (all sections) |
| `--scope` | Named group of sections (`rbac`, `skills`, `agents`, `infra`, `ui`, `quick`). Combinable with `--test`. | (none) |
| `--auth` | `on`, `off`, or `auto`. `auto` enables the Auth section iff `WORKOS_CLIENT_ID` + `WORKOS_API_KEY` are set. | `auto` |
| `--role` | Expected role of the logged-in user under `--auth on`: `owner`, `admin`, `member`, or `viewer`. Setup asserts the live `/api/auth/me` roles match; on mismatch the run stops and the user is told to either change their WorkOS role or re-run with the correct `--role`. Ignored under `--auth off`. | `admin` |
| `--clean` | Delete test entities (smoke-test workspaces / agents / skills) at the end of each section. | `false` |
| `--skip-browser` | Run only API/`curl` checks. UI section is skipped. | `false` |
| `--dir` | Project directory the skill scaffolds into. Forwarded to `scripts/scaffold.sh`. Also reads `$BUILDER_SMOKE_TEST_DIR` from the environment when the flag is omitted. | `~/mastra-builder-smoke-tests/builder-smoke` |
| `--reuse` | If the project already exists at `$PROJECT_DIR` and has `node_modules/@mastra/core`, skip `pnpm install`. Forwarded to `scripts/scaffold.sh`. | `false` |
| `--openai-key` | OPENAI_API_KEY value to write into the scaffolded `.env`. If omitted, the scaffold script falls back to `$OPENAI_API_KEY` in the shell, then to an interactive prompt. | (shell or prompt) |
| `--workos-api-key`
`--workos-client-id`
`--workos-organization-id` | All three are required together to scaffold an auth-on project. Writes `AUTH_PROVIDER=workos` plus the three keys plus `WORKOS_REDIRECT_URI=http://localhost:4111/api/auth/callback` into `.env`. | (auth off) |
If `--auth auto` and no WorkOS env vars are present, the Auth section is auto-skipped and reported as `⏭️ Skipped (no WORKOS_* env vars)`.
### Canonical order
When running multiple sections, execute them in the order shown in the
section table (1 → 15). The order is intentional:
- **Setup** must run first — preflight + readiness probe gate every later
section.
- **Workspace / Reconciliation / Defaults / Model Policy** establish that
the server's view of the project matches what the rest of the run
assumes. Run them before any CRUD pass.
- **Skills → Registry → Agents → Pickers → Stars** is a build-up: agents
reference skills, pickers depend on the entities created above.
- **Permissions / Infrastructure / Channels / UI** are read-mostly
inspections that benefit from existing entities.
- **Auth** runs last because it requires restarting `mastra dev` with a
different `.env`.
If `--test` or `--scope` narrows the run, keep the relative order — just
skip the sections that fall outside the selection.
### Required vs optional reference tiers
References fall into three tiers; an agent should treat them
accordingly:
- **Required (every run):** `setup.md`. Any failure here blocks the rest
of the run.
- **Standard (default tiers for `full`, `quick`, scope shortcuts):**
`workspace.md`, `skills.md`, `agents.md`, `favorites.md`, `ui.md` (core),
`auth.md` when `--auth on`.
- **Extended (only when explicitly selected via `--test`/`--scope` or
the matching code surface changed):** `reconciliation.md`,
`defaults.md`, `model-policy.md`, `registry.md`, `picker-allowlist.md`,
`permissions.md`, `infrastructure.md`, `channels.md`, `ui.md` extended
tier.
When skipping an extended section, mark it `⏭️ Skipped (not in scope)`
in the result table — don't silently omit it.
### Cleanup
The scaffold is a self-contained throwaway directory at `$PROJECT_DIR`. All
fixture state (workspaces, agents, skills, libsql DB, `.mastra/workspace`
files) lives inside it. The smoke test never writes to anything outside
`$PROJECT_DIR` (other than the dev server it runs).
At the end of every run:
1. Stop the dev server (`kill $(lsof -i :4111 -sTCP:LISTEN -t)` or
foreground `Ctrl-C`).
2. Choose how to dispose of fixture state:
- **Reuse:** leave `$PROJECT_DIR` in place. The next run can pass
`--reuse` (or `--skip-scaffold` to preflight) and pick up where this
one left off. Fastest for iterating.
- **Reset:** `rm -rf "$PROJECT_DIR"` (or re-run `scripts/scaffold.sh`
without `--reuse`). Cheapest way to get back to a known-clean state.
Don't bother per-entity DELETE — the directory IS the state.
3. If a section bailed mid-flight (assertion failure, network error),
record the partial state in the report's **Issues** section so the
next run knows what to expect.
Per-entity DELETE calls are only needed when a specific section
explicitly tests DELETE behavior (those sections include the DELETE step
inline). Otherwise the throwaway-directory model handles cleanup.
Never leave the dev server running on `:4111` after the report is filed —
it blocks future runs.
## Prerequisites
- Working tree on the agent-builder feature branch (or any branch you want to QA).
- `pnpm` (10.x) and `node` on `$PATH`. The scaffold uses `pnpm install --ignore-workspace` inside the project dir so the repo-level workspace doesn't interfere.
- An `OPENAI_API_KEY`. Supply via `--openai-key`, export `OPENAI_API_KEY` in the shell, or let the scaffold prompt for it.
- (Optional) WorkOS credentials for `--auth on` runs: `--workos-api-key`, `--workos-client-id`, `--workos-organization-id`.
- Whichever browser MCP/tool the harness has access to. If none is available, run with `--skip-browser` and report UI as `⏭️ Skipped (no browser tool)`.
### Project layout (scaffolded for you)
```text
$PROJECT_DIR/ ← see "Project dir resolution" below
├── package.json ← pnpm overrides → link:/packages/*
├── tsconfig.json
├── .env ← OPENAI_API_KEY (+ AUTH_PROVIDER + WORKOS_* on auth-on)
└── src/mastra/
├── index.ts ← single Mastra instance, reads exported bindings from auth.ts
├── auth.ts ← top-level switch(process.env.AUTH_PROVIDER); no-op when unset
├── agents/index.ts ← weather-agent (gpt-4o-mini)
├── tools/index.ts ← weather-info tool
└── workflows/index.ts ← greet-workflow
```
The `.env` is the **only** thing that flips auth on/off — the same `src/mastra/index.ts` runs in both modes. Re-run `scripts/scaffold.sh` with or without `--workos-*` to switch.
### Project dir resolution
`$PROJECT_DIR` is determined by every script (scaffold, preflight, wait-for-server) using this order:
1. `--dir ` flag
2. `BUILDER_SMOKE_TEST_DIR` env var (e.g. `export BUILDER_SMOKE_TEST_DIR=~/code/builder-smoke`)
3. `~/mastra-builder-smoke-tests/builder-smoke` (default)
For a long-lived setup, exporting `BUILDER_SMOKE_TEST_DIR` once in your shell rc is the lowest-friction option — every script picks it up automatically.
### Running scripts (cwd matters)
All scripts under `.claude/skills/builder-smoke-test/scripts/` resolve the worktree root from their own location. They can be invoked from anywhere, but conventionally the repo root.
| Script | Run from | Notes |
| -------------------- | -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `scaffold.sh` | anywhere | Creates / refreshes `$PROJECT_DIR`. Forwards `--openai-key`, `--workos-*`, `--reuse`, `--dir`. |
| `preflight.sh` | anywhere | Calls `scaffold.sh` then asserts the resulting `.env` matches `--expect off\|on`. |
| `wait-for-server.sh` | anywhere | Hits `http://localhost:4111/api/agents`. cwd doesn't matter. |
| `seed-multi-user.sh` | anywhere | Inserts two skills owned by `user_seed_other` (1 public + 1 private) into the scaffold's libsql DB so non-owner / Library Copy flows can be tested without a second WorkOS account. Server must have booted at least once first. Idempotent. |
Invoke them as `bash .claude/skills/builder-smoke-test/scripts/.sh`. Don't `cd` into `scripts/` first — relative path resolution will break.
`pnpm mastra:dev` must be run from `$PROJECT_DIR` (where the scaffolded `package.json` is).
### How `mastra dev` reads env (important)
`mastra dev` loads `$PROJECT_DIR/.env` via dotenv and **unconditionally overwrites `process.env`** with whatever's there (`packages/cli/src/commands/dev/dev.ts` ~line 384). Practical consequences:
- **`.env` is the source of truth for the running server.** Inline overrides like `AUTH_PROVIDER= pnpm mastra:dev` are silently clobbered.
- **Shell-only vars survive only if `.env` has no entry for the same key.** Re-running `scripts/scaffold.sh` always overwrites `.env`, so to toggle modes, re-scaffold.
- **The auth mode the server actually runs in is determined by `.env` alone.** A globally exported `AUTH_PROVIDER=workos` in your shell does NOT enable WorkOS auth in the server if `.env` doesn't have it — but it WILL leak into anything else this process runs, which is its own kind of confusing. Preflight flags this case.
### Auth modes
Two states matter:
- **auth off** — `AUTH_PROVIDER` is absent (or blank) in `$PROJECT_DIR/.env`. No WorkOS, no RBAC, no FGA. This is the state for the auth-off run.
- **auth on** — `AUTH_PROVIDER=workos` plus `WORKOS_API_KEY`, `WORKOS_CLIENT_ID`, `WORKOS_ORGANIZATION_ID` all present in `$PROJECT_DIR/.env`. WorkOS authentication + role-based access + per-resource FGA all engage. This is the state for the auth-on runs. FGA is wired through the WorkOS auth provider — it can't be disabled independently.
To switch modes, re-run the scaffold with or without the `--workos-*` flags; that's faster and safer than hand-editing `.env`.
### Detection: run preflight before each section
```bash
# Scaffold (or refresh) the project and assert the auth-off baseline:
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off \
--openai-key "$OPENAI_API_KEY"
# Scaffold an auth-on project (re-runs scaffold with WorkOS keys, asserts auth on):
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect on \
--openai-key "$OPENAI_API_KEY" \
--workos-api-key "$WORKOS_API_KEY" \
--workos-client-id "$WORKOS_CLIENT_ID" \
--workos-organization-id "$WORKOS_ORGANIZATION_ID"
```
Preflight chains `scaffold.sh` followed by validation checks (project exists with `node_modules/@mastra/core`, `$PROJECT_DIR/.env` has `OPENAI_API_KEY`, optional WorkOS keys present when `--expect on`, and auth mode matches `--expect`). Each failure prints a stable error code; this table tells the agent what to do.
### Resolving missing env vars
If `scaffold.sh` or `preflight.sh` reports a missing `OPENAI_API_KEY` or `WORKOS_*` var, the agent must **not** silently source any rc file. Instead, work down this list and stop at the first one that resolves:
1. Check whether the var is already in the process env you can see (`echo "${OPENAI_API_KEY:-}"`). If yes, re-run scaffold with `--openai-key "$OPENAI_API_KEY"` (and equivalent for WorkOS).
2. Check whether the var is in `$PROJECT_DIR/.env` from a prior run (`grep -E "^(OPENAI_API_KEY|WORKOS_)" "$PROJECT_DIR/.env" 2>/dev/null`). If yes, you can pass `--reuse` to the next scaffold call.
3. If neither, look for rc files that exist on disk. Common candidates: `~/.zshrc`, `~/.bashrc`, `~/.zshenv`, `~/.profile`, `~/.env.global`, and any project-local `.env` you find. Use `ls -1` (or `test -f`) to confirm before listing — don't fabricate paths.
4. Ask the user in one message: "Can you paste the value(s), or give me permission to source one of these files?" Include the list of files that actually exist.
5. Only after the user explicitly approves a specific file, source it in a subshell and rerun preflight with the inherited env. Pattern:
```bash
# auth off
zsh -c 'source && bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off --reuse'
# auth on (preflight auto-picks WORKOS_API_KEY / WORKOS_CLIENT_ID / WORKOS_ORGANIZATION_ID from the sourced env)
zsh -c 'source && bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect on --reuse'
```
Use `bash -c` instead of `zsh -c` if the approved file is a bashrc.
6. Never write the secret value back into any rc file, never `export` it into the user's interactive shell, and never echo it back in chat in full. Refer to it as `` once you've used it.
| Error code | What it means | What the agent should do |
| ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| `project-dir-missing` | `$PROJECT_DIR` is unset or the directory does not exist (scaffold did not run, or was given a bad `--dir`). | Re-run preflight without `--skip-scaffold`, or pass an existing `--dir ` that scaffold has already populated. |
| `scaffold-failed` | `scripts/scaffold.sh` returned non-zero. | Re-run scaffold with `--no-reuse` to force a fresh install. Inspect the printed `pnpm install` output for the real error. |
| `project-deps-missing` | `$PROJECT_DIR/node_modules/@mastra/core` missing after scaffold. | Re-run scaffold without `--reuse` to force a fresh install. If that still fails, delete `$PROJECT_DIR` and re-run. |
| `openai-key-missing-in-project-env` | `$PROJECT_DIR/.env` has no usable `OPENAI_API_KEY`. | Follow the "Resolving missing env vars" section above. Re-run preflight with `--openai-key ` once you have it. |
| `workos-keys-missing-in-project-env` | `--expect on` but one or more of `WORKOS_API_KEY` / `WORKOS_CLIENT_ID` / `WORKOS_ORGANIZATION_ID` is absent or blank in `.env`. | Follow the "Resolving missing env vars" section above. Re-run preflight with all three `--workos-*` flags. |
| `mode-mismatch` | `--expect` disagrees with the auth mode detected from `$PROJECT_DIR/.env`. | Re-run the scaffold with (auth on) or without (auth off) `--workos-*` flags. The scaffold is idempotent for the parts that don't change. |
| `bad-expect-value` | `--expect` got something other than `off` or `on`. | Fix the invocation. (Parser also rejects flag-like values at parse time with exit 2.) |
**`.env` policy:** the scaffold **owns** `$PROJECT_DIR/.env`. Re-running scaffold overwrites it. Do not hand-edit the scaffolded `.env`; instead, re-run scaffold with different flags. (The skill never edits `.env` files outside `$PROJECT_DIR`.)
### Extracting the session cookie for curl (auth on)
The WorkOS session cookie is `httpOnly`, so `document.cookie` and Stagehand's
`extract` cannot read it from a normal page. To hit authenticated endpoints
from `curl` after a browser SSO login, the scaffold exposes a tiny debug
route gated by an env var:
1. Add `SMOKE_TEST_COOKIE_LEAK=1` to `$PROJECT_DIR/.env` (single line append; the scaffold leaves this var alone on re-run as long as the file already exists).
2. Restart `mastra dev` so the new env is picked up.
3. Sign in once in the Stagehand browser (`stagehand_navigate` to `http://localhost:4111`, complete WorkOS SSO).
4. From the same browser tab, navigate to `http://localhost:4111/smoke-test/cookie` and use `stagehand_extract` to read the page body. The page is a single `text/plain` line containing the request's `Cookie` header verbatim (e.g. `wos_session=…`).
5. Export it once: `export COOKIE=''`. From here on, every authenticated curl is `curl -H "Cookie: $COOKIE" "$BASE/…"`.
The route is **only registered when `SMOKE_TEST_COOKIE_LEAK=1`** and is intentionally insecure — never enable it in a real project. The `WORKOS_COOKIE_PASSWORD` written by the scaffold is derived from `$PROJECT_DIR`, so the cookie value stays valid across `mastra dev` restarts within the same scaffold; you only need to repeat step 4 if you re-scaffold to a new directory.
> **`/smoke-test/cookie` returns 404? Always an env-ordering issue.** The `apiRoutes` list is built once when `mastra dev` boots from `process.env.SMOKE_TEST_COOKIE_LEAK`. The flag has to be in `.env` **before** the boot — adding it after start has no effect until you restart. If you see a 404, run `grep SMOKE_TEST_COOKIE_LEAK "$PROJECT_DIR/.env"`, then stop and restart `mastra dev`. Don't pivot to "UI only" because of this.
### Seeding non-owner skills (Library Copy / non-owner flows)
A fresh scaffold has zero skills, and everything created through the API
is owned by either the auth-off "no caller" (no `authorId`) or the
currently signed-in user under auth-on. To exercise flows that require a
skill **owned by someone else** (Library Copy, non-owner read-only view,
private-skill visibility from a non-owner) without provisioning a second
WorkOS account, run the seed script after the server has booted at least
once:
```bash
# Start the server once so libsql initializes the skills tables.
cd $PROJECT_DIR
pnpm mastra:dev # leave running, then in another shell:
bash .claude/skills/builder-smoke-test/scripts/seed-multi-user.sh
# → seeds smoke-seed-public-skill (visibility=public, status=published)
# smoke-seed-private-skill (visibility=private, status=published)
# both owned by authorId='user_seed_other'
```
The script writes directly to `$PROJECT_DIR/src/mastra/public/mastra.db`
via the `sqlite3` CLI (no Node deps). It's idempotent — re-running
replaces the seeded rows. Use the seeded skills wherever a reference
file asks for "a skill owned by another user"; clean them up with
`DELETE` curls against `/api/stored/skills/:id` or by re-scaffolding.
## Starting the dev server
If the server is not running on `:4111`, the Setup section starts it. The convenience helpers live under `scripts/`:
```bash
# Scaffold + preflight (writes .env, installs deps, detects auth mode)
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off
# Start the server from the scaffolded project
cd ~/mastra-builder-smoke-tests/builder-smoke
pnpm mastra:dev
# Poll /api/agents until 200 (60s budget). Detects mastra dev's port-bump.
bash .claude/skills/builder-smoke-test/scripts/wait-for-server.sh
```
`wait-for-server.sh` probes `/api/agents` — not `/` — because the SPA shell can return 200 before the API mounts. If it reports the server is up on `:4112`+ instead of `:4111`, `mastra dev` fell through to the next port; stop, free `:4111`, and restart. Continuing on a non-default port silently breaks every curl in every reference.
## API base URL
Every reference assumes `$BASE` is exported. Set it once at the start of the run:
```bash
export BASE=http://localhost:4111/api
```
All curl examples in the references use `$BASE` and won't work in a shell that hasn't exported it.
## Quick reference: key endpoints
This table lists the surfaces an agent will hit and where to look for the
authoritative request/response shape. Don't copy curl blocks from here —
run the per-section commands in `references/.md`.
| Surface | Endpoint |
| ----------------- | -------------------------------------------------------------------------- |
| Builder settings | `GET /editor/builder/settings` |
| Builder infra | `GET /editor/builder/infrastructure` |
| Registries (list) | `GET /editor/builder/registries` |
| Registry search | `GET /editor/builder/registries/:registryId/search?q=…` |
| Registry popular | `GET /editor/builder/registries/:registryId/popular` |
| Registry preview | `GET /editor/builder/registries/:registryId/preview?owner=…&repo=…&path=…` |
| Registry install | `POST /editor/builder/registries/:registryId/install` |
| Workspace CRUD | `GET/POST/PATCH/DELETE /stored/workspaces[/:id]` |
| Agent CRUD | `GET/POST/PATCH/DELETE /stored/agents[/:id]` |
| Agent favorite | `PUT / DELETE /stored/agents/:id/favorite` |
| Agent avatar | `PATCH /stored/agents/:id` with `metadata.avatarUrl` (owner-only) |
| Skill CRUD | `GET/POST/PATCH/DELETE /stored/skills[/:id]` |
| Skill publish | `POST /stored/skills/:id/publish` |
| Skill favorite | `PUT / DELETE /stored/skills/:id/favorite` |
| Auth me | `GET /api/auth/me` (returns logged-in user + roles + permissions) |
| Auth refresh | `POST /auth/refresh` |
## Builder Studio routes
| Feature | Route |
| ----------------------- | ------------------------------------------------------------------------------------------ |
| Agent Builder shell | `/agent-builder` |
| Agents (default view) | `/agent-builder` |
| Agent detail (view) | `/agent-builder/agents/:id/view` (bare `:id` redirects to `/view`) |
| Agent detail (edit) | `/agent-builder/agents/:id/edit` |
| Skills | `/agent-builder/skills` |
| Library (public skills) | `/agent-builder/library` |
| Skill detail | `/agent-builder/skills/:id/edit` (owner) or `/agent-builder/skills/:id/view` (non-owner) |
| Workspaces | `/agent-builder/workspaces` |
| Infrastructure | `/agent-builder/infrastructure` (readable by every default role — see `infrastructure.md`) |
Mobile renders a bottom-bar with the same primary entries.
## Browser smoke
Use whichever browser tool the harness has wired up (Stagehand, Chrome MCP, etc.). Don't assume a specific provider — discover what's available, then drive the same checklist in `references/ui.md`.
The scaffolded project registers `StagehandBrowser` (matching `examples/agent-builder`). If `BROWSERBASE_*` keys aren't set in the shell, Stagehand falls back to local Playwright; that's fine for smoke. If neither Stagehand nor a local browser is reachable, mark UI as `⏭️ Skipped (no browser provider)`.
## Result reporting
After testing, provide:
```md
## Builder Smoke Test Results
**Date**:
**Branch**:
**Commit**:
**Server**: scaffolded project @ localhost:4111 (`$PROJECT_DIR`)
**Auth**: on / off / auto-skipped
| # | Section | Status | Notes |
| --- | ------------------ | -------- | ------------------------------- |
| 1 | Setup | ✅/❌ | |
| 2 | Workspace | ✅/❌ | |
| 3 | Reconciliation | ✅/❌/⏭️ | |
| 4 | Defaults | ✅/❌ | |
| 5 | Model Policy | ✅/❌ | |
| 6 | Skills | ✅/❌ | |
| 7 | Registry | ✅/❌ | |
| 8 | Agents | ✅/❌ | |
| 9 | Pickers | ✅/❌ | |
| 10 | Stars | ✅/❌ | |
| 11 | Permissions / RBAC | ✅/❌ | |
| 12 | Infrastructure | ✅/❌ | |
| 13 | Channels | ✅/❌ | |
| 14 | UI | ✅/❌/⏭️ | |
| 15 | Auth | ✅/❌/⏭️ | (skipped if no WORKOS\_\* vars) |
**Product issues**: (list any — server/UI behaved unexpectedly. For each: HTTP method + path or UI route, expected vs actual, one-sentence guess at the cause. Do not pre-decide "known bug" — log what the server actually did. Say "none" if empty.)
**Skill issues**: (list any — the skill itself was wrong, unclear, stale, or unreachable. For each: which file + step (e.g. `references/skills.md` step F2), and what was wrong. Doc drift, not product bugs. Say "none" if empty.)
**Verify before filing.** Before adding anything to either list, re-confirm against the live response in this run, not memory of an earlier call:
- For any **shape mismatch / missing field / wrong key name** claim, paste the actual JSON fragment (or the relevant keys) directly under the bullet so the claim is reproducible. If the skill says `features.agent.skills` and the response has `features.agent.skills`, that is not a skill issue — names that look similar in passing (`featSkills`, `agent.features.skill`, etc.) are easy to misread.
- For any **endpoint inconsistency** claim (e.g. "endpoint A returns X but B returns Y"), re-curl both endpoints fresh in the same run rather than reusing a stale response from earlier in the section.
- For any **RBAC / authz** claim (403 where you expected 200, or vice versa), check `references/permissions.md` for the matrix _and_ check the "Design decisions" list in this file. Several roles intentionally share `*:read`, which means infra/list/get endpoints look "ungated" but are working as intended. Also confirm the cookie you sent belongs to the role you think it does (`curl -H "Cookie: $(cat /tmp/cookie.txt)" $BASE/auth/me | jq '.role // .roles'`).
- For any **missing endpoint** claim (e.g. "agent avatar 404"), confirm the contract first — several flows are client-composed on top of generic CRUD (avatar = `PATCH metadata.avatarUrl`; Library Copy = `POST /stored/skills` with `metadata.origin`). The "Design decisions (don't file as bugs)" section enumerates the common ones.
- If a claim can't be reproduced on a fresh request, drop it.
**Regressions**: (list any behavioral changes from a previous run)
**Warnings**: (e.g., dev-server crash on `/auth/refresh` polling, OPENAI_API_KEY required at startup)
**Skipped sections**: (list with reason)
```
## Known rough edges
The branch has accumulated minor papercuts. Note these in your report only if you hit them; don't fail the run on them:
- Don't `rm` `$PROJECT_DIR/mastra.db` by hand while the server is up — stop the server first, then delete.
- Dev server can crash on hot-reload from `/auth/refresh` polling. Restart and continue.
- `OPENAI_API_KEY` is required at startup — server won't boot without it, even if you only test non-LLM surfaces.
- `mastra dev` overwrites `process.env` from `.env` at boot, so inline env overrides on the command line don't reach the server. Re-run scaffold to change `.env`.
- The scaffold links against the **current worktree's** packages via `link:` overrides. If you switch worktrees, re-run scaffold so the symlinks point at the right tree.
## Design decisions (don't file as bugs)
These have come up across multiple runs and are intentional. If you observe one, note it in your report as "expected behavior" — do **not** open a product issue.
- **`GET /auth/me` without a cookie returns `200` with a `null`-ish body.** The route is mounted as a public route (`createPublicRoute`); the contract is "return the current user or `null`", not "401 if missing". A `401` here would break the public app shell.
- **`/editor/builder/infrastructure` is readable by every default role (admin / member / viewer).** The handler gates on `infrastructure:read` and every default role has `*:read`, which matches by resource-wildcard. The page only exposes deployment-shape data (provider names, registered flags, configured/unconfigured booleans) — no secrets.
- **Flipping a skill's `visibility` from `private` to `public` does not auto-publish unless the skill has a registered `skillPath`.** Visibility and publication are independent fields by design. A plain-create skill flipped public stays at `activeVersionId: null` until a real `POST /publish` runs against a source path.
- **Zod schema validation runs before the permission middleware on `/stored/*` writes.** A malformed body from a viewer returns a 400, not a 403. This is standard request lifecycle; the response surface doesn't leak resource state.
- **The role-impersonation picker only lists roles _different from the current one_.** Logged in as `admin`, you'll see `Member` and `Viewer` and nothing else — there is no `Admin` self-item. This is intentional (admin is the baseline; you're already there).
- **Impersonation is UI-only.** The API still answers per the real logged-in role. A `curl` while impersonating `viewer` will still return the admin's response.
- **`Favorites` sidebar entry links to `/agent-builder/favorite` (singular).** The plural `/favorites` is not a registered route and renders the React Router 404. Use the sidebar link or the singular URL when scripting.
- **Avatar upload uses agent `PATCH` with `metadata.avatarUrl`, not a dedicated `/avatar` endpoint.** See `references/agents.md`.
- **Copy is client-side.** There is no `POST /stored/skills/:id/copy`. The UI fetches the source skill and POSTs a new row to `/stored/skills` with `metadata.origin = "library-copy"`. See `references/registry.md`.
## Out of smoke-test scope
Some flows are documented in `references/` but are not driven by the smoke-test agent because they require server-lifecycle gymnastics that don't fit a single run:
- **Reconciliation steps 2/3/4/6** (`references/reconciliation.md`) require editing `$PROJECT_DIR/src/mastra/index.ts` (changing `basePath` / `workspaceId` / config), restarting `mastra dev` multiple times, and observing drift detection or orphan archival across restarts. The smoke-test agent runs only **Step 1** (fresh-startup persistence) and **Step 5** (non-builder workspaces untouched). Run the rest by hand when changing reconciliation code.
- **Real role-swap testing** (logging in as multiple WorkOS users with different roles in the same run) is out of scope. The agent verifies whichever role the live `--role` user actually has, and additionally exercises the **UI-only role impersonation** flow under `--role admin` (see `references/ui.md`).
## References
- `references/setup.md` — server health, builder settings sanity, baseline counts, builder workspace existence
- `references/workspace.md` — workspace CRUD via API
- `references/reconciliation.md` — config-driven workspace lifecycle (fresh, idempotent, drift, archival, backfill)
- `references/defaults.md` — builder defaults applied at agent create (memory, workspace, browser, model)
- `references/model-policy.md` — allowed list, default model, dropdown filtering, rejection
- `references/skills.md` — skill CRUD, visibility, publish, filesystem writes, files array
- `references/registry.md` — skills.sh browse/install, library Copy flow, origin badges, gating
- `references/agents.md` — stored agent CRUD, skill attachment, model swap, delete-from-edit, avatar upload
- `references/picker-allowlist.md` — tools/agents/workflows pickers respect allowlists
- `references/favorites.md` — favorite/unfavorite agents and skills, idempotency (formerly `stars.md`)
- `references/permissions.md` — viewer/member/admin/owner gating, role expectation matrix, UI impersonation, auth-off bypass
- `references/infrastructure.md` — `/editor/builder/infrastructure` payload + UI
- `references/channels.md` — Slack provider visibility, connectChannel tool
- `references/ui.md` — browser checklist across Builder routes
- `references/auth.md` — WorkOS on/off, 401 behavior, authorId, mode-toggle via `.env`
- `scripts/scaffold.sh` — scaffold or refresh the hermetic project at `$PROJECT_DIR`
- `scripts/preflight.sh` — wraps `scaffold.sh` + mode expectation (`--expect off|on`)
- `scripts/wait-for-server.sh` — poll `:4111` until healthy