---
name: firecrawl
description: Uses Firecrawl to scrape web pages to clean markdown, search and scrape top results, crawl entire websites, or map a domain. Use when the user needs to scrape a URL, crawl a site, search the web and get page content, or discover/map URLs on a domain.
---

# Firecrawl

## When to Use

- Scrape a single page to clean markdown for LLMs or processing
- Search the web and scrape the top results (query → markdown)
- Crawl an entire website with limits and timeout
- Map a domain to discover/index URLs (search, sitemap options)

## Setup

API key: set `FIRECRAWL_API_KEY` in `.env` (or environment). Get a key at [firecrawl.dev](https://firecrawl.dev).

Project helpers (recommended): use `firecrawl_tools.scrape_url`, `search_and_scrape`, `crawl_site`, and `map_domain`. They read the key from the environment and return an error if it is unset.

Direct SDK (firecrawl-py v4):

```python
import os
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=os.getenv("FIRECRAWL_API_KEY"))
```
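`os.getenv` returns `None` when the variable is missing, and it only sees variables already in the process environment; a `.env` file needs a loader (for example python-dotenv, which is an assumption here, not something this skill states). A minimal sketch of a fail-fast check follows. `require_api_key` is a hypothetical name and is not part of `firecrawl_tools` or the SDK.

```python
import os


def require_api_key(env=None):
    """Return the Firecrawl API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = (env.get("FIRECRAWL_API_KEY") or "").strip()
    if not key:
        raise RuntimeError(
            "FIRECRAWL_API_KEY is not set; add it to .env or the environment."
        )
    return key
```

With this in place, the client could be built as `FirecrawlApp(api_key=require_api_key())`, so a missing key surfaces before any request is made.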

## Scrape One URL (v4: `scrape`)

Returns a `Document` holding the page's markdown and metadata.

```python
doc = app.scrape("https://example.com", only_main_content=True)
# doc carries the page content as markdown, plus page metadata
```

Or use the project helper: `firecrawl_tools.scrape_url(url)`.
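To hand the scraped page to an LLM, usually only the markdown string is needed. The sketch below wraps the `scrape` call shown above. It assumes the returned object exposes a `markdown` attribute, and `scrape_markdown` is a hypothetical helper name. The client is passed in, so the function works with any object that has a compatible `scrape` method.

```python
def scrape_markdown(client, url):
    """Scrape one URL and return its markdown, or "" if none came back.

    Assumption: the object returned by `client.scrape` has a `markdown`
    attribute, as the Document described above suggests.
    """
    doc = client.scrape(url, only_main_content=True)
    return getattr(doc, "markdown", None) or ""
```

Called as `scrape_markdown(app, "https://example.com")`, it returns a plain string that can be truncated or chunked before further processing.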

## Search and Scrape Top Results (v4: `search`)

```python
result = app.search("what is Firecrawl?", limit=5)
# result is SearchData (scraped content for top results)
```

Or: `firecrawl_tools.search_and_scrape(query, limit=5)`.
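A short sketch of how the search result might be consumed. It rests on assumptions not stated in this skill: that web results are exposed as a list on `result.web`, and that each item has a `url` attribute and possibly a `markdown` attribute. Check the installed SDK's `SearchData` type before relying on these names. `search_hits` is a hypothetical helper.

```python
def search_hits(client, query, limit=5):
    """Run a search and return (url, markdown_or_None) pairs.

    Assumptions: results are listed on `result.web`; each item has `url`
    and may have `markdown`. Items without a URL are skipped.
    """
    result = client.search(query, limit=limit)
    hits = []
    for item in getattr(result, "web", None) or []:
        url = getattr(item, "url", None)
        if url is None:
            continue
        hits.append((url, getattr(item, "markdown", None)))
    return hits
```

Returning `None` for missing markdown lets the caller decide whether to scrape that URL separately.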

## Crawl a Website (v4: `crawl`)

Starts a crawl and waits until it finishes or the timeout expires. Returns a `CrawlJob` (status, data).

```python
job = app.crawl("https://example.com", limit=100, timeout=300)
# job.status, job.data
```

Or: `firecrawl_tools.crawl_site(start_url, limit=100, timeout=300)`.

To start a crawl without waiting, call `app.start_crawl(url, limit=...)`, then poll with `app.get_crawl_status(job_id)`.
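The start-then-poll pattern can be sketched as a small loop. This is illustrative only and makes several assumptions: the object returned by `start_crawl` has an `id` attribute, the status object has a `status` string, and `"completed"`, `"failed"`, and `"cancelled"` are the terminal values. `wait_for_crawl` is a hypothetical name; the `sleep` and `clock` parameters exist so the loop can be tested without real delays.

```python
import time

# Assumed terminal status strings; verify against the installed SDK.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_crawl(client, url, limit=100, timeout=300, poll_interval=2.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Start a crawl, then poll until it reaches a terminal status.

    Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    job = client.start_crawl(url, limit=limit)
    deadline = clock() + timeout
    while True:
        status = client.get_crawl_status(job.id)
        if status.status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"crawl {job.id} still running after {timeout}s")
        sleep(poll_interval)
```

Polling by hand is mainly useful when the caller wants to report progress or do other work between checks; otherwise the blocking `crawl` call above is simpler.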

## Map a Domain (v4: `map`)

Discovers URLs on a domain. Optional parameters include a search query, sitemap handling, and a result limit.

```python
map_result = app.map("https://example.com", search="pricing", limit=50)
```

Or: `firecrawl_tools.map_domain(url, search=..., limit=...)`.
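Downstream steps usually want a plain list of URL strings. The sketch below flattens a map result under two assumptions: the result exposes a `links` list, and each entry is either a string or an object with a `url` attribute. Neither is confirmed by this skill, and `mapped_urls` is a hypothetical helper name.

```python
def mapped_urls(client, url, search=None, limit=50):
    """Map a domain and return unique URL strings in discovery order.

    Assumptions: the result has a `links` list whose entries are either
    strings or objects with a `url` attribute.
    """
    result = client.map(url, search=search, limit=limit)
    seen = set()
    urls = []
    for link in getattr(result, "links", None) or []:
        href = link if isinstance(link, str) else getattr(link, "url", None)
        if href and href not in seen:
            seen.add(href)
            urls.append(href)
    return urls
```

The resulting list can be fed one URL at a time to the scrape call, which keeps the number of scraped pages explicit.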

## CLI (Optional)

The user can run this locally:

```bash
npx -y firecrawl-cli@latest init --all --browser
```

After that, the CLI can scrape/crawl from the command line; the agent can suggest CLI commands when appropriate.

## Notes

- Read `FIRECRAWL_API_KEY` from the environment; do not hardcode keys.
- For LLM extraction with a schema, use `app.extract` (v4) or see the Firecrawl docs.