---
name: builder-smoke-test
description: Smoke test the Agent Builder feature branch end-to-end against a hermetic project scaffolded by the skill (linked to the current worktree). Covers workspace reconciliation, stored agents/skills CRUD, ownership, visibility, stars, registry/library Copy flow, picker allowlists, model policy, RBAC role gating, role impersonation UI, builder defaults, infrastructure diagnostics, channels, and Studio + Agent Builder UI. Trigger when validating the agent-builder feature branch, PRs that touch packages/server, packages/playground, packages/playground-ui agent-builder routes, or builder EE code paths.
---

# Builder Smoke Test

End-to-end smoke testing of the Agent Builder feature set against a hermetic project the skill scaffolds at `~/mastra-builder-smoke-tests/builder-smoke` (configurable). The project links to the current worktree via `pnpm` `link:` overrides, so changes to packages under `packages/`, `stores/`, `auth/`, `channels/`, `observability/`, `browser/`, and `client-sdks/` take effect on the next `mastra dev` restart.

This skill is for **branch QA** — it complements the release-time `mastra-smoke-test`. It exercises the Builder EE surface (stored entities, RBAC, registry, infra, channels) using a minimal, predictable project rather than the kitchen-sink `examples/agent`.

## ⚠️ Mandatory Test Checklist

**Use `task_write` to track progress.** Run ALL sections unless `--test` or `--scope` narrows the run.

**Do not skip sections unless you hit an actual blocker.** "Seemed complex" or "I'll come back to it" are not valid reasons. Attempt every step — only stop when you literally cannot proceed. Report what you tried and what blocked you.

| # | Section | Reference | When required |
| --- | --- | --- | --- |
| 1 | **Setup** | `references/setup.md` | Always |
| 2 | **Workspace** | `references/workspace.md` | `--test workspace` or full |
| 3 | **Reconciliation** | `references/reconciliation.md` | Steps 1 + 5 only; steps 2/3/4/6 are out of smoke-test scope (see below) |
| 4 | **Defaults** | `references/defaults.md` | `--test defaults` or full |
| 5 | **Model Policy** | `references/model-policy.md` | `--test model-policy` or full |
| 6 | **Skills** | `references/skills.md` | `--test skills` or full |
| 7 | **Registry** | `references/registry.md` | `--test registry` or full |
| 8 | **Agents** | `references/agents.md` | `--test agents` or full |
| 9 | **Picker Allowlists** | `references/picker-allowlist.md` | `--test pickers` or full |
| 10 | **Favorites** | `references/favorites.md` | `--test favorites` or full (formerly `stars`) |
| 11 | **Permissions / RBAC** | `references/permissions.md` | `--test permissions` or full |
| 12 | **Infrastructure** | `references/infrastructure.md` | `--test infrastructure` or full |
| 13 | **Channels** | `references/channels.md` | `--test channels` or full |
| 14 | **UI** | `references/ui.md` | `--test ui` or full |
| 15 | **Auth** | `references/auth.md` | `--test auth` or `--auth on` |

### Execution flow

1. **Confirm the project directory.** Before scaffolding, ask the user where they want `$PROJECT_DIR` to live. Offer the default (`~/mastra-builder-smoke-tests/builder-smoke`) as a suggestion. Skip the question if they already passed `--dir` or have `$BUILDER_SMOKE_TEST_DIR` exported. See `references/setup.md` step 0.
2. **Read the reference file** for each section you're about to run.
3. **Under `--auth on`, extract the session cookie before running any other section.** The WorkOS cookie is `httpOnly`, so `curl` cannot mint it and `document.cookie` cannot read it. The scaffold ships a debug route at `GET /smoke-test/cookie` gated by `SMOKE_TEST_COOKIE_LEAK=1`. Follow the **"Extracting the session cookie for curl (auth on)"** section below before touching any auth-on endpoint. **Do not pivot to UI-only testing because curl is "blocked" — the cookie route is the unblock path.**
4. **Seed non-owner data after the server has booted at least once.** A fresh scaffold has no skills authored by anyone other than the test user, which makes non-owner / Library Copy / non-owner visibility / non-admin favorites flows untestable. Run `bash .claude/skills/builder-smoke-test/scripts/seed-multi-user.sh` (or with `--dir $PROJECT_DIR`) before sections 6 (Skills), 7 (Registry), and 10 (Favorites). The script is idempotent and bypasses RBAC by writing directly to libsql, so it works regardless of `--auth` mode or current role. **Do not mark non-owner steps as "blocked" without running this first.**
5. **Execute the steps** — use `curl` for API checks (with `-H "Cookie: $COOKIE"` under `--auth on`), whichever browser tool the harness has wired up (Stagehand, Chrome MCP, etc.) for UI checks.
6. **Record results** in the summary table.
7. **Mark the section complete** with `task_write` before moving to the next.

### Partial testing (`--test`)

If `--test` is provided:

1. Always run **Setup**.
2. Run only the specified section(s).
3. Skip everything else.

Example: `--test skills,registry,agents` → Setup + Skills + Registry + Agents.

### Scope shortcuts (`--scope`)

`--scope` runs a curated group of related sections. Setup is always implied.

| Scope | Includes |
| --- | --- |
| `rbac` | permissions, auth |
| `skills` | skills, registry, defaults |
| `agents` | agents, pickers, defaults, model-policy |
| `infra` | infrastructure, channels, reconciliation |
| `ui` | ui |
| `quick` | workspace, skills, agents, favorites, ui (skips long-running) |

`--scope` and `--test` can be combined; the union is run.

## Usage

```bash
# Full smoke (interactive)
/builder-smoke-test

# Specific sections
/builder-smoke-test --test workspace,skills
/builder-smoke-test --test agents,favorites
/builder-smoke-test --test reconciliation
/builder-smoke-test --test ui

# Scope shortcuts
/builder-smoke-test --scope rbac
/builder-smoke-test --scope skills
/builder-smoke-test --scope quick

# Force auth on / off (otherwise auto-detected from WORKOS_* env vars)
/builder-smoke-test --auth on
/builder-smoke-test --auth off

# Run auth-on as a non-admin role (must match the logged-in user's actual role)
/builder-smoke-test --auth on --role viewer
/builder-smoke-test --auth on --role member

# Skip the browser pass (API-only run)
/builder-smoke-test --skip-browser
```

## Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| `--test` | Comma-separated section names (see table above). | (all sections) |
| `--scope` | Named group of sections (`rbac`, `skills`, `agents`, `infra`, `ui`, `quick`). Combinable with `--test`. | (none) |
| `--auth` | `on`, `off`, or `auto`. `auto` enables the Auth section iff `WORKOS_CLIENT_ID` + `WORKOS_API_KEY` are set. | `auto` |
| `--role` | Expected role of the logged-in user under `--auth on`: `owner`, `admin`, `member`, or `viewer`. Setup asserts the live `/api/auth/me` roles match; on mismatch the run stops and the user is told to either change their WorkOS role or re-run with the correct `--role`. Ignored under `--auth off`. | `admin` |
| `--clean` | Delete test entities (smoke-test workspaces / agents / skills) at the end of each section. | `false` |
| `--skip-browser` | Run only API/`curl` checks. UI section is skipped. | `false` |
| `--dir` | Project directory the skill scaffolds into. Forwarded to `scripts/scaffold.sh`. Also reads `$BUILDER_SMOKE_TEST_DIR` from the environment when the flag is omitted. | `~/mastra-builder-smoke-tests/builder-smoke` |
| `--reuse` | If the project already exists at `$PROJECT_DIR` and has `node_modules/@mastra/core`, skip `pnpm install`. Forwarded to `scripts/scaffold.sh`. | `false` |
| `--openai-key` | OPENAI_API_KEY value to write into the scaffolded `.env`. If omitted, the scaffold script falls back to `$OPENAI_API_KEY` in the shell, then to an interactive prompt. | (shell or prompt) |
| `--workos-api-key`<br>`--workos-client-id`<br>`--workos-organization-id` | All three are required together to scaffold an auth-on project. Writes `AUTH_PROVIDER=workos` plus the three keys plus `WORKOS_REDIRECT_URI=http://localhost:4111/api/auth/callback` into `.env`. | (auth off) |

If `--auth auto` and no WorkOS env vars are present, the Auth section is auto-skipped and reported as `⏭️ Skipped (no WORKOS_* env vars)`.

### Canonical order

When running multiple sections, execute them in the order shown in the section table (1 → 15). The order is intentional:

- **Setup** must run first — preflight + readiness probe gate every later section.
- **Workspace / Reconciliation / Defaults / Model Policy** establish that the server's view of the project matches what the rest of the run assumes. Run them before any CRUD pass.
- **Skills → Registry → Agents → Pickers → Favorites** is a build-up: agents reference skills, pickers depend on the entities created above.
- **Permissions / Infrastructure / Channels / UI** are read-mostly inspections that benefit from existing entities.
- **Auth** runs last because it requires restarting `mastra dev` with a different `.env`.

If `--test` or `--scope` narrows the run, keep the relative order — just skip the sections that fall outside the selection.

### Required vs optional reference tiers

References fall into three tiers; an agent should treat them accordingly:

- **Required (every run):** `setup.md`. Any failure here blocks the rest of the run.
- **Standard (default tiers for `full`, `quick`, scope shortcuts):** `workspace.md`, `skills.md`, `agents.md`, `favorites.md`, `ui.md` (core), `auth.md` when `--auth on`.
- **Extended (only when explicitly selected via `--test`/`--scope` or the matching code surface changed):** `reconciliation.md`, `defaults.md`, `model-policy.md`, `registry.md`, `picker-allowlist.md`, `permissions.md`, `infrastructure.md`, `channels.md`, `ui.md` extended tier.
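
The selection rules above (Setup always runs, `--test` and `--scope` are unioned, and the canonical order is preserved) can be sketched in shell as follows. This is only an illustration: `resolve_sections` and `scope_sections` are hypothetical helper names and are not part of the skill's `scripts/` directory. The sketch also treats an empty selection as "all 15 sections", whereas a real run would still decide the Auth section from `--auth`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not shipped with the skill.

# Canonical section order, copied from the section table (1 -> 15).
CANONICAL="setup workspace reconciliation defaults model-policy skills registry agents pickers favorites permissions infrastructure channels ui auth"

# scope_sections <scope>: expand a --scope name using the scope table.
scope_sections() {
  case "$1" in
    rbac)   echo "permissions auth" ;;
    skills) echo "skills registry defaults" ;;
    agents) echo "agents pickers defaults model-policy" ;;
    infra)  echo "infrastructure channels reconciliation" ;;
    ui)     echo "ui" ;;
    quick)  echo "workspace skills agents favorites ui" ;;
    "")     ;;
    *)      echo "unknown scope: $1" >&2; return 1 ;;
  esac
}

# resolve_sections <test-csv> <scope>
# Prints the union of both selections, one section per line, in canonical order.
resolve_sections() {
  local tests="$1" scope="$2" wanted section
  if [ -z "$tests" ] && [ -z "$scope" ]; then
    # Nothing narrows the run: every section is selected (assumption: the
    # caller still applies --auth to decide whether Auth actually runs).
    wanted="$CANONICAL"
  else
    # Setup is always implied; an unknown scope aborts the resolution.
    wanted="setup $(echo "$tests" | tr ',' ' ') $(scope_sections "$scope")" || return 1
  fi
  for section in $CANONICAL; do
    case " $wanted " in
      *" $section "*) echo "$section" ;;
    esac
  done
}
```

For example, `resolve_sections "agents,favorites" rbac` prints `setup`, `agents`, `favorites`, `permissions`, and `auth`, in that order. Sections that do not appear in the output are the ones to report as skipped.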

When skipping an extended section, mark it `⏭️ Skipped (not in scope)` in the result table — don't silently omit it.

### Cleanup

The scaffold is a self-contained throwaway directory at `$PROJECT_DIR`. All fixture state (workspaces, agents, skills, libsql DB, `.mastra/workspace` files) lives inside it. The smoke test never writes to anything outside `$PROJECT_DIR` (other than the dev server it runs).

At the end of every run:

1. Stop the dev server (`kill $(lsof -i :4111 -sTCP:LISTEN -t)` or foreground `Ctrl-C`).
2. Choose how to dispose of fixture state:
   - **Reuse:** leave `$PROJECT_DIR` in place. The next run can pass `--reuse` (or `--skip-scaffold` to preflight) and pick up where this one left off. Fastest for iterating.
   - **Reset:** `rm -rf "$PROJECT_DIR"` (or re-run `scripts/scaffold.sh` without `--reuse`). Cheapest way to get back to a known-clean state. Don't bother per-entity DELETE — the directory IS the state.
3. If a section bailed mid-flight (assertion failure, network error), record the partial state in the report's **Issues** section so the next run knows what to expect.

Per-entity DELETE calls are only needed when a specific section explicitly tests DELETE behavior (those sections include the DELETE step inline). Otherwise the throwaway-directory model handles cleanup.

Never leave the dev server running on `:4111` after the report is filed — it blocks future runs.

## Prerequisites

- Working tree on the agent-builder feature branch (or any branch you want to QA).
- `pnpm` (10.x) and `node` on `$PATH`. The scaffold uses `pnpm install --ignore-workspace` inside the project dir so the repo-level workspace doesn't interfere.
- An `OPENAI_API_KEY`. Supply via `--openai-key`, export `OPENAI_API_KEY` in the shell, or let the scaffold prompt for it.
- (Optional) WorkOS credentials for `--auth on` runs: `--workos-api-key`, `--workos-client-id`, `--workos-organization-id`.
- Whichever browser MCP/tool the harness has access to.
  If none is available, run with `--skip-browser` and report UI as `⏭️ Skipped (no browser tool)`.

### Project layout (scaffolded for you)

```text
$PROJECT_DIR/                ← see "Project dir resolution" below
├── package.json             ← pnpm overrides → link:/packages/*
├── tsconfig.json
├── .env                     ← OPENAI_API_KEY (+ AUTH_PROVIDER + WORKOS_* on auth-on)
└── src/mastra/
    ├── index.ts             ← single Mastra instance, reads exported bindings from auth.ts
    ├── auth.ts              ← top-level switch(process.env.AUTH_PROVIDER); no-op when unset
    ├── agents/index.ts      ← weather-agent (gpt-4o-mini)
    ├── tools/index.ts       ← weather-info tool
    └── workflows/index.ts   ← greet-workflow
```

The `.env` is the **only** thing that flips auth on/off — the same `src/mastra/index.ts` runs in both modes. Re-run `scripts/scaffold.sh` with or without `--workos-*` to switch.

### Project dir resolution

`$PROJECT_DIR` is determined by every script (scaffold, preflight, wait-for-server) using this order:

1. `--dir ` flag
2. `BUILDER_SMOKE_TEST_DIR` env var (e.g. `export BUILDER_SMOKE_TEST_DIR=~/code/builder-smoke`)
3. `~/mastra-builder-smoke-tests/builder-smoke` (default)

For a long-lived setup, exporting `BUILDER_SMOKE_TEST_DIR` once in your shell rc is the lowest-friction option — every script picks it up automatically.

### Running scripts (cwd matters)

All scripts under `.claude/skills/builder-smoke-test/scripts/` resolve the worktree root from their own location. They can be invoked from anywhere, but conventionally the repo root.

| Script | Run from | Notes |
| --- | --- | --- |
| `scaffold.sh` | anywhere | Creates / refreshes `$PROJECT_DIR`. Forwards `--openai-key`, `--workos-*`, `--reuse`, `--dir`. |
| `preflight.sh` | anywhere | Calls `scaffold.sh` then asserts the resulting `.env` matches `--expect off\|on`. |
| `wait-for-server.sh` | anywhere | Hits `http://localhost:4111/api/agents`. cwd doesn't matter. |
| `seed-multi-user.sh` | anywhere | Inserts two skills owned by `user_seed_other` (1 public + 1 private) into the scaffold's libsql DB so non-owner / Library Copy flows can be tested without a second WorkOS account. Server must have booted at least once first. Idempotent. |

Invoke them as `bash .claude/skills/builder-smoke-test/scripts/.sh`. Don't `cd` into `scripts/` first — relative path resolution will break.

`pnpm mastra:dev` must be run from `$PROJECT_DIR` (where the scaffolded `package.json` is).

### How `mastra dev` reads env (important)

`mastra dev` loads `$PROJECT_DIR/.env` via dotenv and **unconditionally overwrites `process.env`** with whatever's there (`packages/cli/src/commands/dev/dev.ts` ~line 384). Practical consequences:

- **`.env` is the source of truth for the running server.** Inline overrides like `AUTH_PROVIDER= pnpm mastra:dev` are silently clobbered.
- **Shell-only vars survive only if `.env` has no entry for the same key.** Re-running `scripts/scaffold.sh` always overwrites `.env`, so to toggle modes, re-scaffold.
- **The auth mode the server actually runs in is determined by `.env` alone.** A globally exported `AUTH_PROVIDER=workos` in your shell does NOT enable WorkOS auth in the server if `.env` doesn't have it — but it WILL leak into anything else this process runs, which is its own kind of confusing. Preflight flags this case.

### Auth modes

Two states matter:

- **auth off** — `AUTH_PROVIDER` is absent (or blank) in `$PROJECT_DIR/.env`. No WorkOS, no RBAC, no FGA. This is the state for the auth-off run.
- **auth on** — `AUTH_PROVIDER=workos` plus `WORKOS_API_KEY`, `WORKOS_CLIENT_ID`, `WORKOS_ORGANIZATION_ID` all present in `$PROJECT_DIR/.env`. WorkOS authentication + role-based access + per-resource FGA all engage.
  This is the state for the auth-on runs.

FGA is wired through the WorkOS auth provider — it can't be disabled independently.

To switch modes, re-run the scaffold with or without the `--workos-*` flags; that's faster and safer than hand-editing `.env`.

### Detection: run preflight before each section

```bash
# Scaffold (or refresh) the project and assert the auth-off baseline:
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off \
  --openai-key "$OPENAI_API_KEY"

# Scaffold an auth-on project (re-runs scaffold with WorkOS keys, asserts auth on):
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect on \
  --openai-key "$OPENAI_API_KEY" \
  --workos-api-key "$WORKOS_API_KEY" \
  --workos-client-id "$WORKOS_CLIENT_ID" \
  --workos-organization-id "$WORKOS_ORGANIZATION_ID"
```

Preflight chains `scaffold.sh` followed by validation checks (project exists with `node_modules/@mastra/core`, `$PROJECT_DIR/.env` has `OPENAI_API_KEY`, optional WorkOS keys present when `--expect on`, and auth mode matches `--expect`). Each failure prints a stable error code; the error-code table at the end of "Resolving missing env vars" below tells the agent what to do for each one.

### Resolving missing env vars

If `scaffold.sh` or `preflight.sh` reports a missing `OPENAI_API_KEY` or `WORKOS_*` var, the agent must **not** silently source any rc file. Instead, work down this list and stop at the first one that resolves:

1. Check whether the var is already in the process env you can see (`echo "${OPENAI_API_KEY:-}"`). If yes, re-run scaffold with `--openai-key "$OPENAI_API_KEY"` (and equivalent for WorkOS).
2. Check whether the var is in `$PROJECT_DIR/.env` from a prior run (`grep -E "^(OPENAI_API_KEY|WORKOS_)" "$PROJECT_DIR/.env" 2>/dev/null`). If yes, you can pass `--reuse` to the next scaffold call.
3. If neither, look for rc files that exist on disk. Common candidates: `~/.zshrc`, `~/.bashrc`, `~/.zshenv`, `~/.profile`, `~/.env.global`, and any project-local `.env` you find.
   Use `ls -1` (or `test -f`) to confirm before listing — don't fabricate paths.
4. Ask the user in one message: "Can you paste the value(s), or give me permission to source one of these files?" Include the list of files that actually exist.
5. Only after the user explicitly approves a specific file, source it in a subshell and rerun preflight with the inherited env. Pattern:

   ```bash
   # auth off
   zsh -c 'source && bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off --reuse'

   # auth on (preflight auto-picks WORKOS_API_KEY / WORKOS_CLIENT_ID / WORKOS_ORGANIZATION_ID from the sourced env)
   zsh -c 'source && bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect on --reuse'
   ```

   Use `bash -c` instead of `zsh -c` if the approved file is a bashrc.
6. Never write the secret value back into any rc file, never `export` it into the user's interactive shell, and never echo it back in chat in full. Refer to it as `` once you've used it.

| Error code | What it means | What the agent should do |
| --- | --- | --- |
| `project-dir-missing` | `$PROJECT_DIR` is unset or the directory does not exist (scaffold did not run, or was given a bad `--dir`). | Re-run preflight without `--skip-scaffold`, or pass an existing `--dir ` that scaffold has already populated. |
| `scaffold-failed` | `scripts/scaffold.sh` returned non-zero. | Re-run scaffold without `--reuse` to force a fresh install. Inspect the printed `pnpm install` output for the real error. |
| `project-deps-missing` | `$PROJECT_DIR/node_modules/@mastra/core` missing after scaffold. | Re-run scaffold without `--reuse` to force a fresh install. If that still fails, delete `$PROJECT_DIR` and re-run. |
| `openai-key-missing-in-project-env` | `$PROJECT_DIR/.env` has no usable `OPENAI_API_KEY`. | Follow the "Resolving missing env vars" section above. Re-run preflight with `--openai-key ` once you have it. |
| `workos-keys-missing-in-project-env` | `--expect on` but one or more of `WORKOS_API_KEY` / `WORKOS_CLIENT_ID` / `WORKOS_ORGANIZATION_ID` is absent or blank in `.env`. | Follow the "Resolving missing env vars" section above. Re-run preflight with all three `--workos-*` flags. |
| `mode-mismatch` | `--expect` disagrees with the auth mode detected from `$PROJECT_DIR/.env`. | Re-run the scaffold with (auth on) or without (auth off) `--workos-*` flags. The scaffold is idempotent for the parts that don't change. |
| `bad-expect-value` | `--expect` got something other than `off` or `on`. | Fix the invocation. (Parser also rejects flag-like values at parse time with exit 2.) |

**`.env` policy:** the scaffold **owns** `$PROJECT_DIR/.env`. Re-running scaffold overwrites it. Do not hand-edit the scaffolded `.env`; instead, re-run scaffold with different flags. (The skill never edits `.env` files outside `$PROJECT_DIR`.)

### Extracting the session cookie for curl (auth on)

The WorkOS session cookie is `httpOnly`, so `document.cookie` and Stagehand's `extract` cannot read it from a normal page. To hit authenticated endpoints from `curl` after a browser SSO login, the scaffold exposes a tiny debug route gated by an env var:

1. Add `SMOKE_TEST_COOKIE_LEAK=1` to `$PROJECT_DIR/.env` (single line append; the scaffold leaves this var alone on re-run as long as the file already exists).
2. Restart `mastra dev` so the new env is picked up.
3. Sign in once in the Stagehand browser (`stagehand_navigate` to `http://localhost:4111`, complete WorkOS SSO).
4. From the same browser tab, navigate to `http://localhost:4111/smoke-test/cookie` and use `stagehand_extract` to read the page body. The page is a single `text/plain` line containing the request's `Cookie` header verbatim (e.g. `wos_session=…`).
5. Export it once: `export COOKIE=''`. From here on, every authenticated curl is `curl -H "Cookie: $COOKIE" "$BASE/…"`.

The route is **only registered when `SMOKE_TEST_COOKIE_LEAK=1`** and is intentionally insecure — never enable it in a real project. The `WORKOS_COOKIE_PASSWORD` written by the scaffold is derived from `$PROJECT_DIR`, so the cookie value stays valid across `mastra dev` restarts within the same scaffold; you only need to repeat step 4 if you re-scaffold to a new directory.

> **`/smoke-test/cookie` returns 404? Always an env-ordering issue.** The `apiRoutes` list is built once when `mastra dev` boots from `process.env.SMOKE_TEST_COOKIE_LEAK`. The flag has to be in `.env` **before** the boot — adding it after start has no effect until you restart. If you see a 404, run `grep SMOKE_TEST_COOKIE_LEAK "$PROJECT_DIR/.env"`, then stop and restart `mastra dev`. Don't pivot to "UI only" because of this.

### Seeding non-owner skills (Library Copy / non-owner flows)

A fresh scaffold has zero skills, and everything created through the API is owned by either the auth-off "no caller" (no `authorId`) or the currently signed-in user under auth-on. To exercise flows that require a skill **owned by someone else** (Library Copy, non-owner read-only view, private-skill visibility from a non-owner) without provisioning a second WorkOS account, run the seed script after the server has booted at least once:

```bash
# Start the server once so libsql initializes the skills tables.
cd "$PROJECT_DIR"
pnpm mastra:dev   # leave running, then in another shell:

bash .claude/skills/builder-smoke-test/scripts/seed-multi-user.sh
# → seeds smoke-seed-public-skill  (visibility=public,  status=published)
#         smoke-seed-private-skill (visibility=private, status=published)
#   both owned by authorId='user_seed_other'
```

The script writes directly to `$PROJECT_DIR/src/mastra/public/mastra.db` via the `sqlite3` CLI (no Node deps).
It's idempotent — re-running replaces the seeded rows. Use the seeded skills wherever a reference file asks for "a skill owned by another user"; clean them up with `DELETE` curls against `/api/stored/skills/:id` or by re-scaffolding.

## Starting the dev server

If the server is not running on `:4111`, the Setup section starts it. The convenience helpers live under `scripts/`:

```bash
# Scaffold + preflight (writes .env, installs deps, detects auth mode)
bash .claude/skills/builder-smoke-test/scripts/preflight.sh --expect off

# Start the server from the scaffolded project
cd ~/mastra-builder-smoke-tests/builder-smoke
pnpm mastra:dev

# Poll /api/agents until 200 (60s budget). Detects mastra dev's port-bump.
bash .claude/skills/builder-smoke-test/scripts/wait-for-server.sh
```

`wait-for-server.sh` probes `/api/agents` — not `/` — because the SPA shell can return 200 before the API mounts. If it reports the server is up on `:4112`+ instead of `:4111`, `mastra dev` fell through to the next port; stop, free `:4111`, and restart. Continuing on a non-default port silently breaks every curl in every reference.

## API base URL

Every reference assumes `$BASE` is exported. Set it once at the start of the run:

```bash
export BASE=http://localhost:4111/api
```

All curl examples in the references use `$BASE` and won't work in a shell that hasn't exported it.

## Quick reference: key endpoints

This table lists the surfaces an agent will hit and where to look for the authoritative request/response shape. All endpoints are relative to `$BASE`. Don't copy curl blocks from here — run the per-section commands in `references/.md`.

| Surface | Endpoint |
| --- | --- |
| Builder settings | `GET /editor/builder/settings` |
| Builder infra | `GET /editor/builder/infrastructure` |
| Registries (list) | `GET /editor/builder/registries` |
| Registry search | `GET /editor/builder/registries/:registryId/search?q=…` |
| Registry popular | `GET /editor/builder/registries/:registryId/popular` |
| Registry preview | `GET /editor/builder/registries/:registryId/preview?owner=…&repo=…&path=…` |
| Registry install | `POST /editor/builder/registries/:registryId/install` |
| Workspace CRUD | `GET/POST/PATCH/DELETE /stored/workspaces[/:id]` |
| Agent CRUD | `GET/POST/PATCH/DELETE /stored/agents[/:id]` |
| Agent favorite | `PUT / DELETE /stored/agents/:id/favorite` |
| Agent avatar | `PATCH /stored/agents/:id` with `metadata.avatarUrl` (owner-only) |
| Skill CRUD | `GET/POST/PATCH/DELETE /stored/skills[/:id]` |
| Skill publish | `POST /stored/skills/:id/publish` |
| Skill favorite | `PUT / DELETE /stored/skills/:id/favorite` |
| Auth me | `GET /auth/me` (returns logged-in user + roles + permissions) |
| Auth refresh | `POST /auth/refresh` |

## Builder Studio routes

| Feature | Route |
| --- | --- |
| Agent Builder shell | `/agent-builder` |
| Agents (default view) | `/agent-builder` |
| Agent detail (view) | `/agent-builder/agents/:id/view` (bare `:id` redirects to `/view`) |
| Agent detail (edit) | `/agent-builder/agents/:id/edit` |
| Skills | `/agent-builder/skills` |
| Library (public skills) | `/agent-builder/library` |
| Skill detail | `/agent-builder/skills/:id/edit` (owner) or `/agent-builder/skills/:id/view` (non-owner) |
| Workspaces | `/agent-builder/workspaces` |
| Infrastructure | `/agent-builder/infrastructure` (readable by every default role — see `infrastructure.md`) |

Mobile renders a bottom-bar with the same primary entries.

## Browser smoke

Use whichever browser tool the harness has wired up (Stagehand, Chrome MCP, etc.). Don't assume a specific provider — discover what's available, then drive the same checklist in `references/ui.md`.

The scaffolded project registers `StagehandBrowser` (matching `examples/agent-builder`). If `BROWSERBASE_*` keys aren't set in the shell, Stagehand falls back to local Playwright; that's fine for smoke. If neither Stagehand nor a local browser is reachable, mark UI as `⏭️ Skipped (no browser provider)`.

## Result reporting

After testing, provide:

```md
## Builder Smoke Test Results

**Date**:
**Branch**:
**Commit**:
**Server**: scaffolded project @ localhost:4111 (`$PROJECT_DIR`)
**Auth**: on / off / auto-skipped

| # | Section | Status | Notes |
| --- | --- | --- | --- |
| 1 | Setup | ✅/❌ | |
| 2 | Workspace | ✅/❌ | |
| 3 | Reconciliation | ✅/❌/⏭️ | |
| 4 | Defaults | ✅/❌ | |
| 5 | Model Policy | ✅/❌ | |
| 6 | Skills | ✅/❌ | |
| 7 | Registry | ✅/❌ | |
| 8 | Agents | ✅/❌ | |
| 9 | Pickers | ✅/❌ | |
| 10 | Favorites | ✅/❌ | |
| 11 | Permissions / RBAC | ✅/❌ | |
| 12 | Infrastructure | ✅/❌ | |
| 13 | Channels | ✅/❌ | |
| 14 | UI | ✅/❌/⏭️ | |
| 15 | Auth | ✅/❌/⏭️ | (skipped if no WORKOS\_\* vars) |

**Product issues**: (list any — server/UI behaved unexpectedly. For each: HTTP method + path or UI route, expected vs actual, one-sentence guess at the cause. Do not pre-decide "known bug" — log what the server actually did. Say "none" if empty.)

**Skill issues**: (list any — the skill itself was wrong, unclear, stale, or unreachable. For each: which file + step (e.g. `references/skills.md` step F2), and what was wrong. Doc drift, not product bugs. Say "none" if empty.)

**Verify before filing.** Before adding anything to either list, re-confirm against the live response in this run, not memory of an earlier call:

- For any **shape mismatch / missing field / wrong key name** claim, paste the actual JSON fragment (or the relevant keys) directly under the bullet so the claim is reproducible. If the skill says `features.agent.skills` and the response has `features.agent.skills`, that is not a skill issue — names that look similar in passing (`featSkills`, `agent.features.skill`, etc.) are easy to misread.
- For any **endpoint inconsistency** claim (e.g. "endpoint A returns X but B returns Y"), re-curl both endpoints fresh in the same run rather than reusing a stale response from earlier in the section.
- For any **RBAC / authz** claim (403 where you expected 200, or vice versa), check `references/permissions.md` for the matrix _and_ check the "Design decisions" list in this file. Several roles intentionally share `*:read`, which means infra/list/get endpoints look "ungated" but are working as intended. Also confirm the cookie you sent belongs to the role you think it does (`curl -H "Cookie: $COOKIE" "$BASE/auth/me" | jq '.role // .roles'`).
- For any **missing endpoint** claim (e.g. "agent avatar 404"), confirm the contract first — several flows are client-composed on top of generic CRUD (avatar = `PATCH metadata.avatarUrl`; Library Copy = `POST /stored/skills` with `metadata.origin`). The "Design decisions (don't file as bugs)" section enumerates the common ones.
- If a claim can't be reproduced on a fresh request, drop it.

**Regressions**: (list any behavioral changes from a previous run)

**Warnings**: (e.g., dev-server crash on `/auth/refresh` polling, OPENAI_API_KEY required at startup)

**Skipped sections**: (list with reason)
```

## Known rough edges

The branch has accumulated minor papercuts.
Note these in your report only if you hit them; don't fail the run on them:

- Don't `rm` `$PROJECT_DIR/mastra.db` by hand while the server is up — stop the server first, then delete.
- Dev server can crash on hot-reload from `/auth/refresh` polling. Restart and continue.
- `OPENAI_API_KEY` is required at startup — server won't boot without it, even if you only test non-LLM surfaces.
- `mastra dev` overwrites `process.env` from `.env` at boot, so inline env overrides on the command line don't reach the server. Re-run scaffold to change `.env`.
- The scaffold links against the **current worktree's** packages via `link:` overrides. If you switch worktrees, re-run scaffold so the symlinks point at the right tree.

## Design decisions (don't file as bugs)

These have come up across multiple runs and are intentional. If you observe one, note it in your report as "expected behavior" — do **not** open a product issue.

- **`GET /auth/me` without a cookie returns `200` with a `null`-ish body.** The route is mounted as a public route (`createPublicRoute`); the contract is "return the current user or `null`", not "401 if missing". A `401` here would break the public app shell.
- **`/editor/builder/infrastructure` is readable by every default role (admin / member / viewer).** The handler gates on `infrastructure:read` and every default role has `*:read`, which matches by resource-wildcard. The page only exposes deployment-shape data (provider names, registered flags, configured/unconfigured booleans) — no secrets.
- **Flipping a skill's `visibility` from `private` to `public` does not auto-publish unless the skill has a registered `skillPath`.** Visibility and publication are independent fields by design. A plain-create skill flipped public stays at `activeVersionId: null` until a real `POST /publish` runs against a source path.
- **Zod schema validation runs before the permission middleware on `/stored/*` writes.** A malformed body from a viewer returns a 400, not a 403.
This is standard request lifecycle; the response surface doesn't leak resource state.
- **The role-impersonation picker only lists roles _different from the current one_.** Logged in as `admin`, you'll see `Member` and `Viewer` and nothing else — there is no `Admin` self-item. This is intentional (admin is the baseline; you're already there).
- **Impersonation is UI-only.** The API still answers per the real logged-in role. A `curl` while impersonating `viewer` will still return the admin's response.
- **`Favorites` sidebar entry links to `/agent-builder/favorite` (singular).** The plural `/favorites` is not a registered route and renders the React Router 404. Use the sidebar link or the singular URL when scripting.
- **Avatar upload uses agent `PATCH` with `metadata.avatarUrl`, not a dedicated `/avatar` endpoint.** See `references/agents.md`.
- **Copy is client-side.** There is no `POST /stored/skills/:id/copy`. The UI fetches the source skill and POSTs a new row to `/stored/skills` with `metadata.origin = "library-copy"`. See `references/registry.md`.

## Out of smoke-test scope

Some flows are documented in `references/` but are not driven by the smoke-test agent because they require server-lifecycle gymnastics that don't fit a single run:

- **Reconciliation steps 2/3/4/6** (`references/reconciliation.md`) require editing `$PROJECT_DIR/src/mastra/index.ts` (changing `basePath` / `workspaceId` / config), restarting `mastra dev` multiple times, and observing drift detection or orphan archival across restarts. The smoke-test agent runs only **Step 1** (fresh-startup persistence) and **Step 5** (non-builder workspaces untouched). Run the rest by hand when changing reconciliation code.
- **Real role-swap testing** (logging in as multiple WorkOS users with different roles in the same run) is out of scope.
The agent verifies whichever role the live `--role` user actually has, and additionally exercises the **UI-only role impersonation** flow under `--role admin` (see `references/ui.md`).

## References

- `references/setup.md` — server health, builder settings sanity, baseline counts, builder workspace existence
- `references/workspace.md` — workspace CRUD via API
- `references/reconciliation.md` — config-driven workspace lifecycle (fresh, idempotent, drift, archival, backfill)
- `references/defaults.md` — builder defaults applied at agent create (memory, workspace, browser, model)
- `references/model-policy.md` — allowed list, default model, dropdown filtering, rejection
- `references/skills.md` — skill CRUD, visibility, publish, filesystem writes, files array
- `references/registry.md` — skills.sh browse/install, library Copy flow, origin badges, gating
- `references/agents.md` — stored agent CRUD, skill attachment, model swap, delete-from-edit, avatar upload
- `references/picker-allowlist.md` — tools/agents/workflows pickers respect allowlists
- `references/favorites.md` — favorite/unfavorite agents and skills, idempotency (formerly `stars.md`)
- `references/permissions.md` — viewer/member/admin/owner gating, role expectation matrix, UI impersonation, auth-off bypass
- `references/infrastructure.md` — `/editor/builder/infrastructure` payload + UI
- `references/channels.md` — Slack provider visibility, connectChannel tool
- `references/ui.md` — browser checklist across Builder routes
- `references/auth.md` — WorkOS on/off, 401 behavior, authorId, mode-toggle via `.env`
- `scripts/scaffold.sh` — scaffold or refresh the hermetic project at `$PROJECT_DIR`
- `scripts/preflight.sh` — wraps `scaffold.sh` + mode expectation (`--expect off|on`)
- `scripts/wait-for-server.sh` — poll `:4111` until healthy
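
When triaging a finding against the "Design decisions (don't file as bugs)" list in this file, a small lookup can help separate expected behavior from something worth a fresh re-check. The sketch below is illustrative only: the `triage` function and its observation keys are hypothetical names, not part of the skill's scripts, and it encodes only the three HTTP status codes that the list states explicitly.

```bash
#!/usr/bin/env bash
# Hypothetical triage helper (not shipped with the skill).
# Usage: triage <observation-key> <http-status>
# Prints "expected" when the pair matches a documented design decision,
# otherwise "investigate" (re-run the request fresh before filing).
triage() {
  local observation="$1" status="$2"
  case "${observation}:${status}" in
    # GET /auth/me is a public route: 200 with a null-ish body, not 401.
    "auth-me-no-cookie:200") echo "expected" ;;
    # Every default role has *:read, so infrastructure:read matches for viewer.
    "viewer-infrastructure-read:200") echo "expected" ;;
    # Schema validation runs before the permission middleware on /stored/* writes.
    "viewer-malformed-stored-write:400") echo "expected" ;;
    # Anything else is not covered by this sketch.
    *) echo "investigate" ;;
  esac
}
```

For example, `triage auth-me-no-cookie 200` prints `expected`, while `triage auth-me-no-cookie 401` prints `investigate`. The keys are free-form labels chosen for this sketch; map them to whatever naming your report uses.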