# Usage Guide ## Which Tool When? — Decision Tree ``` User asks about something external / current │ ├─→ web_search("...") │ │ │ ├─→ 1 relevant result? │ │ └─→ web_fetch(url) ← no interaction needed │ │ OR │ │ └─→ web_browse(url, actions) ← needs interaction │ │ │ └─→ 2–5 relevant results? │ ├─→ All need no interaction? │ │ └─→ web_batch_fetch(urls[]) ← parallel fetch │ └─→ Some need interaction? │ └─→ web_fetch (no-interaction ones) │ web_browse (interactive ones) ← sequential │ └─→ User provides a URL directly ├─→ No interaction needed / loads on first request? │ └─→ web_fetch(url) └─→ Needs clicking / scrolling / waiting? └─→ web_browse(url, actions) ``` --- ## Tool Comparison | | `web_fetch` | `web_browse` | `web_batch_fetch` | |--|-------------|--------------|-------------------| | **Pages** | 1 | 1 | 1–15 (2–5 recommended) | | **Browser** | Yes (Scrapling) | Yes (agent-browser) | Yes (Scrapling) | | **Interaction** | ❌ No | ✅ Click, fill, scroll, wait | ❌ No | | **Selector** | ✅ Per-URL | ✅ Final state | ✅ Applied to all | | **Stealthy** | ✅ Yes | ❌ No | ✅ Yes | | **Speed** | Fast | Slower (browser ops) | Medium (parallel) | | **Best for** | Articles, docs, blogs | SPAs, forms, pagination | Research synthesis | `web_fetch` falls back to HTTP GET after a normal browser fetch fails, but not in stealthy mode. `web_batch_fetch` falls back to GET after failed browser fetches in all modes. --- ## Firecrawl Keyless fallback When a local backend cannot do the job, the tools automatically retry through **Firecrawl Keyless** (1,000 free credits/month, no API key, no signup) before giving up. It is **fallback-only** — never the primary path — and is **opt-out-able** with `PI_WEB_FIRECRAWL_FALLBACK=0`. Requires the optional `firecrawl-cli` (`npm install -g firecrawl-cli`); if it is absent the tools simply surface the original local error. Agents should call `web_search`/`web_fetch`/`web_browse` first and call `firecrawl_*` directly only after the corresponding local-first tool failed, or when the user explicitly asks for Firecrawl/cloud behavior. | Tool | Falls back to Firecrawl when… | |------|-------------------------------| | `web_search` | SearXNG errors out **or** returns zero results | | `web_fetch` | scrapling (incl. its HTTP-GET fallback) fails — anti-bot, heavy JS, PDFs | | `web_browse` | agent-browser is missing or its batch fails (not on caller validation errors) | | `web_batch_fetch` | (no fallback — Firecrawl batch scrape is not keyless) | The three `firecrawl_*` tools are fallback-only explicit escape hatches for capabilities the local backends lack (`github`/`research`/`pdf` search categories, cloud rendering, natural-language interaction). They are not the first step for ordinary URL reading; `web_fetch` already performs Firecrawl fallback internally when local fetching fails. **Graceful skip.** If the fallback itself cannot help — the CLI is missing, the IP is flagged as suspicious, the keyless quota is exhausted, or the fallback is disabled — the tool falls through to the original local-tool error so the user is never left worse off. **Credit budgeting.** Search ≈ 2 credits / 10 results, scrape ≈ 1 credit / page, interact ≈ 2 credits/min (code-only) or ≈ 7 credits/min (AI prompt). Results report `creditsUsed` where the source provides it. The fallback stays conservative (small limits) against the 1,000 credits/month allowance. **Privacy.** Firecrawl is a cloud service: when the fallback runs, the URL/query and page content leave the machine. Set `PI_WEB_FIRECRAWL_FALLBACK=0` to enforce a strict local-only, no-cloud-egress policy. The fallback is **keyless-only** — it never reads, stores, or sends an API key, and spawns the CLI under an isolated temporary `HOME`. ---