---
title: "@trailblaze/scripting — Authoring Vision & Roadmap (for TS authors)"
type: decision
date: 2026-04-22
---

# `@trailblaze/scripting` — Authoring Vision & Roadmap

Companion devlog aimed at **TypeScript authors** who'll be writing custom Trailblaze tools. Where the companion devlogs focus on decisions (envelope shape, proto-vs-JSON, runtime mechanics), this one captures the **author experience** — what you write today, what you'll write tomorrow, and where we're headed.

Review target: give TS-literate reviewers enough to push back on the API shape before we commit harder.

## The one-paragraph pitch

Trailblaze is an AI-driven UI-testing framework where the LLM calls "tools" to drive the app (tap, type, assert, remember-this). Authors have always been able to add their own tools. Today that means writing Kotlin. We're adding a TypeScript authoring surface so anyone who can spin up an MCP server can ship custom tools — no Kotlin, no Gradle, no JVM round-trip. `@trailblaze/scripting` is a thin wrapper around `@modelcontextprotocol/sdk` that hides the protocol ceremony, exposes Trailblaze's device/session context as a typed object, and (landing in the next PR) lets your tool call back into Trailblaze's own primitives to compose higher-level behaviour.

## What authors write today (raw MCP SDK)

Committed reference: `examples/android-sample-app/trails/config/mcp/tools.ts`

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

const server = new McpServer(
  { name: "sample-app-mcp-tools", version: "0.1.0" },
  { capabilities: { tools: {} } },
);

server.registerTool(
  "generateTestUser",
  { description: "...", inputSchema: {} },
  async () => ({
    content: [{ type: "text", text: JSON.stringify({ name: "Sam", email: "sam@example.com" }) }],
    isError: false,
  }),
);
// ...repeat for every tool...

const transport = new StdioServerTransport();
await server.connect(transport);
```

Four MCP imports, three pieces of server-construction ceremony, and you're on the hook for stdio transport wiring per file. More importantly: **the handler has no access to Trailblaze state.** Device platform, screen dimensions, agent memory, the session id — none of it is reachable from inside the tool. If you need them, you can dig them out of `process.env` (where we set a handful of `TRAILBLAZE_*` vars at spawn), but it's off-contract and inconvenient.

## What authors will write with `@trailblaze/scripting`

Committed reference: `examples/android-sample-app/trails/config/mcp-sdk/tools.ts`

```typescript
import { trailblaze } from "@trailblaze/scripting";

trailblaze.tool(
  "generateTestUser",
  { description: "..." },
  async (_args, ctx) => {
    // ctx is TrailblazeContext | undefined
    //   ctx.device.platform      "ios" | "android" | "web"
    //   ctx.device.driverType, .widthPixels, .heightPixels
    //   ctx.memory               agent memory as Record<string, unknown>
    //   ctx.sessionId            opaque session id for log correlation
    //   ctx.invocationId         per-call id — forward on callbacks
    //   ctx.baseUrl              daemon URL for the callback channel
    return {
      content: [{ type: "text", text: JSON.stringify({ name: "Sam", email: "sam@example.com" }) }],
    };
  },
);

await trailblaze.run();
```

One import. One call per tool. `ctx` is typed. `await trailblaze.run()` replaces the server + transport boilerplate.

The two reference files sit side-by-side in the sample app and expose the **same two tools under suffixed names** (`generateTestUser` / `generateTestUserSdk`) so CI exercises both authoring surfaces in parallel and we notice the moment either one drifts.

## The `TrailblazeContext` envelope

Injected by the host on every `tools/call` under `_meta.trailblaze`.
Shape is locked in the [envelope-migration devlog](2026-04-22-scripting-sdk-envelope-migration.md):

```typescript
type TrailblazeContext = {
  baseUrl: string;       // daemon HTTP base URL
  sessionId: string;     // opaque; log correlation only
  invocationId: string;  // per-tool-call; forward on callbacks
  device: {
    platform: "ios" | "android" | "web";
    widthPixels: number;
    heightPixels: number;
    driverType: string;  // e.g. "android-ondevice-accessibility"
  };
  memory: Record<string, unknown>; // string-valued today; typed `unknown` for forward-compat
};
```

`ctx` is `undefined` when the tool was invoked outside a Trailblaze session (ad-hoc MCP client, unit test) — your handler decides whether to degrade gracefully or refuse.

There's also a **legacy arg envelope** (`_trailblazeContext` inside `arguments`) that pre-dates the SDK and still works; the raw-SDK example reads it. New tools should only read via `ctx`. Both envelopes are injected in parallel during the migration window.

## What the next landing unlocks

This landing ships the authoring surface — tools declare themselves, register, and execute. They have typed access to context but can't *call back* into Trailblaze. A follow-up lights up that second half.

The callback architecture inherits directly from two prior design threads:

- **[Decision 029 — Custom Tool Architecture](2026-02-03-custom-tool-architecture.md)** specified the RPC path: proto-typed commands over HTTP, with authors reaching a Trailblaze daemon endpoint discovered via `baseUrl` / `trailblazeInvocationId` on the envelope. Our callback endpoint (`/scripting/callback`) is that endpoint, JSON-first instead of proto (see D2 in the [envelope-migration devlog](2026-04-22-scripting-sdk-envelope-migration.md#d2-callback-wire-format-json-via-kotlinxserialization)).
- **[Synchronous Tool Execution from JS](2026-04-20-scripted-tools-a2-sync-execute.md)** specified the *semantics* — synchronous `trailblaze.execute(toolName, params)` returning a `TrailblazeToolResult`, reentrance, recording behaviour, error variants as JS objects. That work was originally aimed at QuickJS (in-process); we're taking the same shape and putting it on HTTP for the subprocess path. Same author-facing contract, different transport.

The next PR's surface:

```typescript
import { trailblaze } from "@trailblaze/scripting";

trailblaze.tool(
  "signUpNewUser",
  { description: "Creates a fresh account and signs in." },
  async (_args, ctx, client) => {
    // client.callTool(name, args) hits the daemon's /scripting/callback endpoint,
    // which deserializes via toolRepo.toolCallToTrailblazeTool(name, argsJson) and
    // executes against the live session's agent.
    const user = await client.callTool("generateTestUser", {});
    await client.callTool("tapOnElementWithText", { text: "Name field" });
    await client.callTool("inputText", { text: user.name });
    await client.callTool("tapOnElementWithText", { text: "Sign up" });
    return { content: [{ type: "text", text: `Signed up as ${user.email}` }] };
  },
);

await trailblaze.run();
```

The Kotlin side is already in place: `/scripting/callback` validates the `invocationId`, resolves the live `TrailblazeToolRepo` + execution context, dispatches, returns the result as JSON. The callback landing is a pure TS change — no daemon changes, no proto.

### What the callback unlocks in practice

Once a subprocess can call back into Trailblaze, it can do anything a Kotlin-authored tool can do — because it's dispatching the same tools through the same repo.
The patterns this enables are described end-to-end in the [execution-model devlog](2026-04-20-scripted-tools-execution-model.md), which was originally written for the in-process QuickJS path but applies identically here:

- **Query view hierarchy / visibility.** `await client.callTool("assertVisibleWithText", { text: "Login" })` and branch on `success: true/false`. Same information `trailblaze.isVisible(...)` would have surfaced in the QuickJS vision — no need for a separate query API because every Trailblaze tool's result is already the answer.
- **Read agent memory** (already live via `ctx.memory`). Write via callback into `rememberText` or any Trailblaze tool that writes memory.
- **Try-then-fallback composition.** Call a primitive, inspect `success`, branch. The original [Decision 025 vision](2026-02-20-scripted-tools-vision.md) was emit-only and couldn't do this; callbacks restore the observability that vision was missing.
- **Polling / stabilization loops.** The [execution-model devlog](2026-04-20-scripted-tools-execution-model.md#who-actually-needs-this) uses the API-29 "wait for main screen + stay stable for N checks" pattern as the motivating example — 80 lines of Kotlin today, ~15 lines of TS with callbacks.
- **Custom assertions.** Compose whatever branching logic you want on top of the primitive tools, still recorded for deterministic replay.

The sync-execute devlog's [design concerns](2026-04-20-scripted-tools-a2-sync-execute.md) (reentrance caps, timeout discipline, recording semantics, thread safety) apply here too — flagged in the [envelope-migration devlog](2026-04-22-scripting-sdk-envelope-migration.md#callback-channel-design-concerns-flagged-for-the-callback-channel-landing) as things the callback landing has to pin down.

### Design question we'd love feedback on

**Should the TS surface ever get typed command wrappers?** A future option is to generate `client.tap({ text: "..." })`, `client.inputText(...)`, etc. from the Kotlin tool catalog so authors get IDE completion instead of "call `callTool('tapOnElementWithText', ...)` and hope the name hasn't changed."

We're leaning **no**: name + JSON args is the lowest-common-denominator surface, it evolves without forcing SDK re-publishes every time a tool changes, and it works identically for Python / on-device QuickJS consumers later. The original [Decision 038 execution-model devlog](2026-04-20-scripted-tools-execution-model.md#pr-b-typed-query-ergonomics-optional-polish) flagged PR B (typed query ergonomics) as "optional polish" for exactly this reason — everything expressible via `client.callTool("assertVisibleWithText", { text }).success === true` is already there; typed wrappers are about readability, not capability. The [earlier `commands.proto` prototype](2026-02-03-custom-tool-architecture.md) tried typed wrappers and what we'd be reviving is essentially that idea's codegen pipeline.

If your team has a strong opinion here — ergonomics win vs. catalog drift risk — this is the load-bearing place to push back before that option lands.

## Where we're heading

| Milestone | What |
|----|------|
| **This landing** | Authoring surface + callback endpoint. Tools declare, register, execute. No composition yet. |
| **Callback landing** | `client.callTool(name, args)` on the TS side. Tools compose other Trailblaze tools. |
| **Typed-commands decision** | Decision: typed commands or stay at `callTool(name, args)`. (Current lean: stay untyped.) |
| Later | Python SDK with the same envelope shape. Same HTTP callback endpoint. |
| Later still | On-device QuickJS bundle of the same `.ts` source — same authoring code, different transport. |

The authoring surface and the envelope contract are deliberately **runtime-agnostic**: the same `tools.ts` should compile for both the host-subprocess mode (today) and the on-device QuickJS mode (later) without code changes.
That constraint is why callbacks go over HTTP + JSON instead of MCP-as-transport — an MCP-over-stdio callback would be a different protocol than the HTTP-to-daemon callback QuickJS-on-device will need to issue.

## Authoring workflow today

1. **Install** — `cd /trails/config/mcp-sdk && bun install` (or `npm install`). The SDK is consumed today via a local `file:` link in the sample example; npm-registry publish is a follow-up.
2. **Author** — `tools.ts` in that directory, using `@trailblaze/scripting`.
3. **Wire** — add an entry under `mcp_servers:` in the target YAML:

   ```yaml
   mcp_servers:
     - script: ./mcp-sdk/tools.ts
   ```

4. **Run** — Trailblaze spawns the file as a bun/node subprocess at session start, registers every tool it advertises, and includes them in the LLM's tool catalog for that session.

No Kotlin, no Gradle, no JVM in your author loop. If bun isn't on your PATH, Trailblaze falls back to `node + tsx`.

## What we want from web-team review

Specific questions:

1. **`trailblaze.tool()` API shape.** Is `(name, spec, handler)` the right shape, or do you prefer `trailblaze.tool({ name, description, handler })` as an options object? We picked positional to match `server.registerTool`.
2. **`ctx` as second handler arg, or options-bag?** Second arg was chosen for destructure-friendliness (`async (_args, { device, memory })`).
3. **Handler return type.** Today we mirror the MCP SDK's shape (`{ content: [{ type: "text", text }], isError? }`). Worth wrapping? Concern: every wrapper is another thing to maintain on upgrade.
4. **Typed commands.** Worth the codegen cost or stay at `callTool(name, args)`?
5. **Error handling.** Thrown errors → `isError: true` MCP result, or let the subprocess crash? Today the SDK lets the MCP SDK's default behavior stand.
6. **Memory writes.** Today memory is read-only from `ctx.memory`. Authors who want to remember something call back into a Trailblaze tool that writes memory. Ergonomically right, or should `ctx.memory` expose `set`?
7. **Publishing.** Private `file:` link today. What's the right publishing path — GitHub Package Registry? npm public with a `@trailblaze/` scope? A private registry?
8. **TypeScript tooling.** Sample tsconfig uses `Node16` module resolution + strict mode. Any standards we should align with up-front?

## References

### MCP-based tools (what lands how)

- **Direction doc:** [Scripted Tools PR A3 — MCP SDK Subprocess Toolsets](2026-04-20-scripted-tools-mcp-subprocess.md) — the *what* (Option-2 amendment: subprocess MCP, not QuickJS-first)
- **Scope for PR A3 (what shipped before this SDK landing):** [Host-Side Subprocess MCP Toolsets (Scope)](2026-04-20-scripted-tools-a3-host-subprocess.md) — `mcp_servers:` YAML, spawn, handshake, tool registration, env-var contract
- **Subprocess lifecycle + registration details:** [Scripted Tools PR A3 Phase 1](2026-04-20-scripted-tools-a3-subprocess-impl.md)
- **MCP conventions (`_meta["trailblaze/*"]` keys, result shape, naming):** [Scripted Tools — MCP Extension Conventions](2026-04-20-scripted-tools-mcp-conventions.md)
- **Forward-looking integration patterns (Tier 1 first-party vs Tier 2 third-party servers, toolset-level `mcp_servers:`, metadata overlays):** [MCP Server Integration Patterns](2026-04-21-scripted-tools-mcp-integration-patterns.md)
- **Toolset consolidation (how host-subprocess + on-device bundle split):** [Scripted Tools — Toolset Consolidation & Revised Sequencing](2026-04-20-scripted-tools-toolset-consolidation.md)

### SDK callback / JSON-RPC / proto (the "call back into Trailblaze" thread)

- **Foundational RPC architecture (HTTP proto-JSON, `baseUrl` + `trailblazeInvocationId`, the pattern this SDK inherits):** [Decision 029: Custom Tool Architecture](2026-02-03-custom-tool-architecture.md)
- **Envelope + callback contract (this landing's decisions — D1 dual-write envelope, D2 JSON-not-proto):** [Scripting SDK — Envelope Migration & Callback Transport](2026-04-22-scripting-sdk-envelope-migration.md)
- **Synchronous execute semantics (originally for QuickJS; same contract we're delivering via HTTP):** [Scripted Tools PR A2 — Synchronous Tool Execution from JS](2026-04-20-scripted-tools-a2-sync-execute.md)

### Query view hierarchy / memory / execute-tools — "observe and react" patterns

- **Execution-model master plan (scripted tools, `trailblaze.execute()`, typed queries PR B, reentrance + timeout design):** [Scripted Tools Execution Model (QuickJS + Synchronous Host Bridge)](2026-04-20-scripted-tools-execution-model.md)
- **Original scripted-tools vision (Decision 025, where memory + emit started):** [Scripted Tools Vision](2026-02-20-scripted-tools-vision.md)
- **On-device QuickJS bundle (how the same `.ts` source will run on-device later):** [Scripted Tools PR A5 — MCP Toolsets Bundled for On-Device](2026-04-20-scripted-tools-on-device-bundle.md)

### Complementary authoring paths

- **YAML-defined tools (static composition; scripts complement, don't replace):** [Decision 037: YAML-Defined Tools](2026-04-20-yaml-defined-tools.md)

## Appendix: sample-app side-by-side

Both files implement `generateTestUser` and `currentEpochMillis`. The SDK file suffixes them (`...Sdk`) so both paths register side-by-side in the session without colliding. Read them together to see the diff the SDK makes:

- Raw MCP SDK: `examples/android-sample-app/trails/config/mcp/tools.ts`
- SDK: `examples/android-sample-app/trails/config/mcp-sdk/tools.ts`

CI exercises both end-to-end via `SampleAppMcpToolsTest` and `SampleAppMcpSdkToolsTest` so drift between the two authoring surfaces shows up immediately.
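
When reading the two sample files, one thing neither makes obvious is the "degrade gracefully or refuse" choice a handler faces when `ctx` is `undefined`. The following is a minimal sketch of the "refuse" branch, not code from the sample app. It assumes the envelope shape documented in this devlog and redeclares it locally; the real SDK may export its own type. The `describeDevice` tool name and the `userName` memory key are hypothetical, chosen only for illustration.

```typescript
// Local mirror of the envelope shape from this devlog (assumption: the SDK's
// exported type, if any, has the same fields).
type TrailblazeContext = {
  baseUrl: string;
  sessionId: string;
  invocationId: string;
  device: {
    platform: "ios" | "android" | "web";
    widthPixels: number;
    heightPixels: number;
    driverType: string;
  };
  memory: Record<string, unknown>;
};

// MCP-style text result, mirroring the handler return shape shown earlier.
type ToolResult = {
  content: { type: "text"; text: string }[];
  isError?: boolean;
};

// Hypothetical handler body: refuses outside a Trailblaze session,
// otherwise reports the device and a value read from agent memory.
function describeDevice(ctx: TrailblazeContext | undefined): ToolResult {
  if (ctx === undefined) {
    return {
      content: [{ type: "text", text: "describeDevice needs a Trailblaze session" }],
      isError: true,
    };
  }
  const { platform, widthPixels, heightPixels } = ctx.device;
  // Memory values are typed `unknown`, so narrow before use.
  // "userName" is a made-up key for this sketch.
  const rawUser = ctx.memory["userName"];
  const userName = typeof rawUser === "string" ? rawUser : "unknown user";
  return {
    content: [
      { type: "text", text: `${platform} ${widthPixels}x${heightPixels} for ${userName}` },
    ],
  };
}
```

Under the positional API shown in this devlog, wiring it up would presumably look like `trailblaze.tool("describeDevice", { description: "..." }, async (_args, ctx) => describeDevice(ctx))`. Keeping the logic in a plain function also means it can be unit-tested without spawning a subprocess, which is exactly the situation where `ctx` arrives as `undefined`.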