---
name: nimble-agents
argument-hint: "[query or URL]"
description: >
  Finds, generates, and runs agents to extract structured data from websites at
  scale. Handles multi-source extraction with unified normalized schemas, batch
  pipeline composition, and SDK code generation with structured output. Use when
  the user asks to "get data from a website", "scrape a website", "compare data
  points across websites", "generate a web scraper", or mentions Nimble.
allowed-tools:
  - mcp__nimble-mcp-server__nimble_agents_list
  - mcp__nimble-mcp-server__nimble_agents_get
  - mcp__nimble-mcp-server__nimble_agents_generate
  - mcp__nimble-mcp-server__nimble_agents_generate_status
  - mcp__nimble-mcp-server__nimble_agents_run
  - mcp__nimble-mcp-server__nimble_agents_publish
  - mcp__nimble-mcp-server__nimble_web_search
disable-model-invocation: false
license: MIT
metadata:
  version: "0.4.0"
  author: Nimbleway
  repository: https://github.com/Nimbleway/agent-skills
---

# Nimble Agents

Structured web data extraction via Nimble agents. Always finish with executed results or runnable code.

User request: $ARGUMENTS

## Prerequisites

Ensure the Nimble MCP server is connected:

**Claude Code:**

```bash
export NIMBLE_API_KEY="your_api_key"
claude mcp add --transport http nimble-mcp-server https://mcp.nimbleway.com/mcp \
  --header "Authorization: Bearer ${NIMBLE_API_KEY}"
```

**VS Code (Copilot / Continue):**

```json
{
  "nimble-mcp-server": {
    "command": "npx",
    "args": [
      "-y",
      "mcp-remote@latest",
      "https://mcp.nimbleway.com/mcp",
      "--header",
      "Authorization:Bearer YOUR_API_KEY"
    ]
  }
}
```

**Get an API key:** [online.nimbleway.com/signup](https://online.nimbleway.com/signup) → Account Settings → API Keys

## Core principles

- **Fastest path to data.** The default route is: discover agent → get schema → run → display results. Planning (Step 1P), codegen (Step 3B), and generation (Step 3C) are **escalation paths** — enter them only when signals explicitly require it. Most requests resolve by finding and running an existing agent.
- **Infer, don't ask.** Only use `AskUserQuestion` when there is genuine ambiguity that cannot be resolved from context.
- **AskUserQuestion for all choices.** Never present choices as plain numbered lists in markdown. AskUserQuestion provides interactive arrow-key selection. Constraints: 2–4 options, header max 12 chars, label 1–5 words. "Other" is added automatically. The recommended option goes first with "(Recommended)" appended.
- **Keep output concise.** Present results and options. No commentary about implementation choices, architecture, or performance.
- **Schema before run — always.** Call `nimble_agents_get` before `nimble_agents_run` to understand input/output fields. **This applies every time an agent is run, including when pivoting to a fallback agent after errors.** When switching agents, always repeat the full cycle: `nimble_agents_get` → present schema → confirm → run. Present input parameters (name, required, type, example) and key output fields in a markdown table so the user knows what to expect.
- **Verify response shape before codegen.** Check the `skills` (output fields) and `entity_type` from `nimble_agents_get` to determine the correct REST API response nesting. See **`references/agent-api-reference.md`** > "Response shape inference" and **`references/sdk-patterns.md`** > "Response structure verification".
- **Web search for disambiguation.** When the target domain is unfamiliar or no agent clearly matches, use `nimble_web_search` to explore what data exists before committing to an agent approach.
  `nimble_web_search` is the preferred tool for all information-finding tasks (research, reviews, general search).
- **`google_search` is not a general search tool.** It is a SERP analysis agent — use it only when the user's *intent* is to analyze Google's search results page itself (e.g., rank/position tracking, SEO competitive analysis, SERP feature monitoring). Before considering `google_search`, all other options must be exhausted: dedicated agents via `nimble_agents_list`, `nimble_web_search`, and agent generation. If the goal is "find information about X", use `nimble_web_search`. If the goal is "where does X rank on Google for keyword Y", use `google_search`. See **`references/error-recovery.md`** for the full fallback hierarchy.

## Response shapes

| Layer | Path | Shape | When used |
|-------|------|-------|-----------|
| MCP tool (`nimble_agents_run`) | `data.results` | Always array | Interactive path (Steps 3A, 3C) |
| REST API — ecommerce SERP | `data.parsing` | `list` (array of records) | Codegen path (Step 3B) |
| REST API — non-ecommerce SERP | `data.parsing.entities.{EntityType}` | `dict` with nested arrays | Codegen: `google_search`, `google_maps_search`, etc. |
| REST API — PDP | `data.parsing` | `dict` (flat fields) | Codegen path (Step 3B) |

Always check `typeof`/`isinstance` before iterating REST responses. **Before generating code**, inspect the `skills` output from `nimble_agents_get` to determine which shape applies — see **`references/sdk-patterns.md`** > "Response structure verification".
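A minimal Python sketch of the shape guard implied by the table above. The `data.parsing` and `entities` paths come from the table (the MCP path `data.results` is always an array and needs no guard); the helper function itself is illustrative, not part of the Nimble SDK.

```python
# Minimal sketch of the shape guard from the "Response shapes" table above.
# `response` is assumed to be a parsed REST response body (a dict); this helper
# is illustrative only and not a Nimble SDK function.
def extract_records(response: dict) -> list[dict]:
    parsing = response.get("data", {}).get("parsing")
    if isinstance(parsing, list):
        # Ecommerce SERP: already a list of records.
        return parsing
    if isinstance(parsing, dict):
        entities = parsing.get("entities")
        if isinstance(entities, dict):
            # Non-ecommerce SERP: dict of entity type -> list of records.
            return [rec for records in entities.values() for rec in records]
        # PDP: a single flat record.
        return [parsing]
    return []
```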
## Step 1: Parse intent and route

From `$ARGUMENTS`, extract all signals at once.

**What to detect:**

| Signal | Values | How to detect |
|--------|--------|---------------|
| **Clarity** | `clear` (default) or `needs-planning` | See planning gate below |
| **Execution mode** | `interactive` (default) or `codegen` | See routing table below |
| **Scale** | `small` (≤50 results) or `large` (>50) | Numbers, "all", "top 1000", "bulk", "batch" |
| **Output format** | `display`, `csv`, `json`, `file` | "CSV", "JSON file", "save to", "spreadsheet", "export" |
| **Stores** | list of store names | "amazon", "walmart", "across X and Y", "compare", "both" |
| **Target type** | `search` (keyword) or `detail` (URL/ID) | Keywords → SERP agent; specific URLs/ASINs → PDP agent |
| **Language** | inferred from project | Check codebase (see language inference below) |

### Planning gate (most requests skip this)

**Default: proceed directly to Step 2.** Only route to plan mode when ALL of these are absent:

- A specific URL, site, or domain
- Clear or inferable data to extract
- A single well-scoped task

In other words, plan mode is for vague or multi-step requests like "build me a scraping pipeline" or "I need competitive intelligence" where the target, data fields, or structure are genuinely unclear.

When `needs-planning`, go to **Step 1P** below. Otherwise skip to **Step 2**.

### Step 1P: Plan mode (unclear intent)

Follow the planning protocol in **`references/planning-workflow.md`**:

1. **Clarify** — use `AskUserQuestion` to resolve critical unknowns (max 2 questions at once).
2. **Explore** — call `nimble_agents_list` for each target. Use `nimble_web_search` for unfamiliar domains.
3. **Present plan** — show a gap analysis table (Site / Agent / Status: Existing or Generate) and confirm.
4. **Execute** — Step 2 for existing agents, Step 3C for generations (in parallel as background tasks).

### Execution mode routing (default: interactive)

**Route to codegen only when ANY of these are true:**

- Scale > ~50 results (pagination needed)
- Output format is file-based (CSV, JSON file, etc.)
- Multi-store comparison with merging
- Batch input (file of URLs/IDs)
- User explicitly asks for a script/code

**Otherwise route to interactive** (MCP tool calls).

### Language inference (for codegen path)

Check the project for language signals — do NOT ask unless ambiguous:

| Project file | Inferred language |
|--------------|-------------------|
| `pyproject.toml`, `requirements.txt`, `setup.py`, `*.py` files | Python |
| `package.json`, `tsconfig.json` | TypeScript/Node |
| `go.mod` | Go (REST API) |
| `Gemfile`, `*.rb` files | Ruby (REST API) |
| `Cargo.toml` | Rust (REST API) |
| None of the above | Default to Python |

**Only ask via AskUserQuestion if** both Python and Node project files exist simultaneously, or the user's codebase gives conflicting signals.

### Multi-agent detection

If the request mentions multiple stores ("compare across Amazon and Walmart", "both", "vs"), plan multi-agent orchestration upfront — search for agents for ALL stores in parallel, not sequentially.

## Step 2: Agent discovery

Call `nimble_agents_list` with **short, general keywords** (1–2 words). For multi-store requests, search for each store in parallel. If `count` exceeds `curr_count` in the response, paginate using `skip` to see more agents. Present results 5 at a time.

**How to present results depends on ambiguity:**

| Situation | Action |
|-----------|--------|
| Exactly 1 matching agent | Narrate: "Found `agent_name` — matches your request." Auto-advance. |
| 2+ plausible matches | Show table + `AskUserQuestion` with top 2 agents + "Generate new agent" |
| 0 matches | Use `nimble_web_search` to explore the target domain first (see `error-recovery.md` > "Ambiguous agent match"), then auto-advance to the generate path. `google_search` is not a fallback for missing agents — it is only for SERP analysis tasks (rank tracking, SEO). |
| Codegen path + clear match | Select the agent silently. No need to ask — the user will review the code. |

When presenting search results, show a markdown table of the top 5, then use AskUserQuestion only if the choice between agents is genuinely ambiguous.

## Step 3A: Interactive path (small scale, display output)

**3A-1.** Call `nimble_agents_get` on the chosen agent. Present the schema clearly in markdown tables:

- **Input parameters:** Show each `input_properties` entry with name, required (yes/no), type, description, and example value.
- **Output fields:** Show key fields from the `skills` dict with name and type, so the user knows what data to expect.

See **`references/input-schema-guide.md`** for the full `input_properties` format and mapping rules.

**3A-2.** When intent is unambiguous (single matching agent, clear parameters, user provided the URL/query), **auto-advance directly to run** — skip confirmation. Otherwise, use `AskUserQuestion` to confirm:

```
question: "Run this agent?"
header: "Confirm"
options:
  - label: "Run agent (Recommended)"
    description: "Execute {agent_name} with inferred parameters"
  - label: "Generate new agent"
    description: "Create a custom agent instead"
```

When confirming, do NOT call `nimble_agents_run` in the same response as `AskUserQuestion`.

**3A-3.** Call `nimble_agents_run`. Present results as a markdown table. **Auto-advance to Step 4 (final summary)** when the original request is fully satisfied.
Only use `AskUserQuestion` for next steps when there is a clear reason to offer follow-up:

```
question: "What next?"
header: "Next step"
options:
  - label: "Done"
    description: "Finish with these results"
  - label: "Run again"
    description: "Re-run with different parameters"
  - label: "Get code"
    description: "Generate a script to reproduce this"
```

## Step 3B: Codegen path (large scale, file output, multi-store)

**3B-1.** Call `nimble_agents_get` on the chosen agent(s). Inspect both `input_properties` and `skills` (output fields). Use `skills` to determine the correct response parsing structure — see the "Response shapes" table above. Do NOT present schemas interactively — use them to inform code generation.

**3B-2.** Infer the language from project context (see the language inference table above).

**3B-3.** Generate a ready-to-run script. Consult **`references/sdk-patterns.md`** for correct patterns. Script requirements:

- For the inferred language, use the appropriate SDK or REST API.
- **Smoke test first:** Every batch script MUST validate a single query (submit → poll → fetch → verify data) before launching the full batch. Abort if the smoke test fails or returns empty. See **`references/sdk-patterns.md`** > "Smoke test".
- **Progress reporting:** Print a compact single-line status after each poll cycle: elapsed time, done/total, results count, in-flight count. Use `flush=True` or `PYTHONUNBUFFERED=1` for background scripts.
- Handle pagination for large result sets.
- For multi-store: normalize fields per **`references/normalization-guide.md`**.
- For CSV/file output: write results to the requested format.
- For deduplication: deduplicate by (store, product_name) or equivalent — see the normalization guide.
- For large pipelines (50+ jobs) with an **append-friendly output format** (CSV, JSONL, or Parquet): use incremental file writes for crash resilience — see **`references/sdk-patterns.md`** > "Incremental File Writes". JSON arrays are NOT append-friendly — buffer in memory and write at the end for JSON output.

A minimal structural sketch of these requirements appears at the end of this step.

**Python:** Use the `nimble_python` SDK with `uv run` inline metadata. Choose the right template based on job count — see the routing table in **`references/sdk-patterns.md`** (section: "When to use async vs sync").

**TypeScript/Node, curl, other languages:** Use the REST API directly. See **`references/rest-api-patterns.md`** for patterns and examples.

**3B-4.** Present the generated code and use `AskUserQuestion`:

```
question: "Run this script now?"
header: "Execute"
options:
  - label: "Run now (Recommended)"
    description: "Execute the script and show results"
  - label: "Save only"
    description: "Save the file without running"
```
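A minimal, hedged sketch of the script shape required by 3B-3: smoke test first, single-line progress, incremental CSV writes, and dedup by (store, product_name). The `run_agent` helper and its argument are placeholders, not the real SDK or REST call; the actual calls and templates come from `references/sdk-patterns.md`. Only the ordering of the safeguards is the point here.

```python
# Structural sketch only: smoke test -> incremental CSV writes -> dedup -> progress line.
# `run_agent` is a placeholder for the real SDK/REST call (see references/sdk-patterns.md).
import csv
import sys
import time


def run_agent(query: str) -> list[dict]:
    """Placeholder: submit one agent job and return its records.

    Replace the body with the actual nimble_python SDK or REST call.
    """
    return []  # empty on purpose; the smoke test below will catch this


def main(queries: list[str], out_path: str = "results.csv") -> None:
    # Smoke test first: one query end-to-end before launching the full batch.
    smoke = run_agent(queries[0])
    if not smoke:
        sys.exit("Smoke test returned no data; aborting before the full batch.")

    fieldnames = sorted(smoke[0].keys())
    seen: set[tuple[str, str]] = set()  # dedup key: (store, product_name)
    start = time.time()

    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        done = 0
        for query in queries:
            for rec in run_agent(query):
                key = (rec.get("store", ""), rec.get("product_name", ""))
                if key in seen:
                    continue
                seen.add(key)
                writer.writerow(rec)  # incremental write: crash-resilient
            f.flush()
            done += 1
            # Compact single-line progress after each job.
            print(f"[{time.time() - start:6.1f}s] {done}/{len(queries)} queries, "
                  f"{len(seen)} unique records", flush=True)


if __name__ == "__main__":
    main(["example query"])
```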
## Step 3C: Generate path (no existing agent matches)

**3C-1.** Create a stable `session_id` (UUID v4, reuse it for all generate/publish calls).

**3C-2.** Call `nimble_agents_generate` with a clear prompt. If the user specifies exact output fields (e.g., "extract name, price, and rating"), include an `output_schema` in the generate call to guide the agent's extraction. Handle status:

- `"waiting"` — present follow-up questions via `AskUserQuestion`, then call generate again with the same `session_id` and the user's answer as `prompt`.
- `"processing"` — launch a **background polling task** (see below).
- `"complete"` — auto-advance to run and publish.
- `"error"` — analyze the error. If retryable (timeout, transient failure), try generating again with an improved prompt. Otherwise present the error and offer alternatives.

### Background polling protocol (`processing` status)

Agent generation takes 2–10 minutes. Launch a **background Task agent** to poll with `nimble_agents_generate_status` every 30 seconds (max 20 checks). The conversation stays responsive while the agent polls.

Tell the user: "Agent generation started — this typically takes 2–5 minutes (up to 10). I'll check progress in the background."

See **`references/generate-and-publish.md`** > "Status: processing" for the exact Task prompt template and outcome handling (what to do on `complete`, `waiting`, `error`, or timeout).

When generating multiple agents, launch background tasks **in parallel** — one per `session_id`.

**3C-3.** Route to Step 3A (interactive) or Step 3B (codegen) to run the new agent first (before publishing), based on the execution mode determined in Step 1.

**3C-4.** After a successful run, use `AskUserQuestion` to offer publishing. If confirmed, call `nimble_agents_publish` with the same `session_id`. On a 409 response, the agent is already published — proceed.
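Referring back to 3C-1 and 3C-2 above, illustrative values for the stable `session_id` and a field-list `output_schema`. The exact parameter names and schema format are defined by the `nimble_agents_generate` tool itself; the shapes below are assumptions shown only to make the intent concrete.

```python
# Illustrative only: the exact generate-call parameter format comes from the
# nimble_agents_generate tool schema; these values just show the intent of 3C-1/3C-2.
import uuid

# 3C-1: one stable session_id, reused for every generate/publish call in this flow.
session_id = str(uuid.uuid4())

# 3C-2: if the user asked for specific fields ("extract name, price, and rating"),
# pass them as an output_schema hint. Field names here are the user's; the
# structure is an assumption, not the confirmed tool format.
output_schema = {
    "name": "string",
    "price": "number",
    "rating": "number",
}
```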
## Step 4: Final response

End with a concise summary table:

| Field | Value |
|-------|-------|
| Agent(s) used | `agent_name` |
| Source | Existing / Generated |
| Records extracted | count |
| Output | Displayed / `filename.csv` |

Include the extraction results (or the top N if large).

## Documentation & troubleshooting

When encountering errors or needing grounding, consult in order:

1. **`references/sdk-patterns.md`** — correct SDK patterns and common mistakes.
2. **https://docs.nimbleway.com/llms-full.txt** — full prose docs.
3. **https://docs.nimbleway.com/openapi.json** — API contract.
4. **Context7** (if available) — query `nimbleway`.

## Error recovery

When errors occur or additional grounding is needed, consult **`references/error-recovery.md`** for handling patterns, including:

- **Persistent data source failures** — when to stop retrying and pivot to `nimble_web_search` or agent generation. `google_search` is only for SERP analysis intent (rank tracking, SEO).
- **Ambiguous agent match** — using `nimble_web_search` to explore unfamiliar domains before generating custom agents.

## Additional references

Load reference files proactively during code generation. For the codegen path, always consult `references/sdk-patterns.md` (Python) or `references/rest-api-patterns.md` (other languages) before generating code. For error recovery, consult `references/error-recovery.md`. Load other references as needed.

- **`references/sdk-patterns.md`** — Python SDK: running agents, async endpoint, batch pipelines, incremental file writes (CSV/JSONL/Parquet).
- **`references/input-schema-guide.md`** — Mapping agent input schemas to params.
- **`references/agent-api-reference.md`** — Reference for all six MCP tools (including `nimble_agents_generate_status`).
- **`references/error-recovery.md`** — Error handling and recovery patterns.
- **`references/normalization-guide.md`** — Multi-agent field mapping, unified schema, deduplication.
- **`references/find-and-run-agent.md`** — Existing-agent path walkthrough.
- **`references/planning-workflow.md`** — Plan mode protocol for unclear/complex intents.
- **`references/generate-and-publish.md`** — Generate fallback walkthrough (includes polling protocol and outcome handling).
- **`references/bulk-extraction.md`** — Multi-URL batch extraction walkthrough.
- **`references/rest-api-patterns.md`** — REST API patterns for TypeScript, Node, curl, and other non-Python languages.
- **`references/codegen-walkthrough.md`** — Codegen path walkthrough: multi-store comparison with CSV output.

## Guardrails

- Agent workflows only — list, get, generate, generate_status, run, publish. No scheduling or monitoring.
- To modify an existing agent, generate a new one with an improved prompt — there is no update operation.
- **Never ask for information already in the request or inferable from context.**
- Present tool call results in markdown tables. Never show raw JSON.
- Adapt table columns to match the actual data returned.