---
name: exa
description: Search the web and scrape page contents using Exa API. Find URLs via neural/keyword search, then extract clean text, highlights, and summaries. Use when the user wants to search the web, find pages, scrape content, extract text, or fetch articles. Triggers on "search for", "find pages about", "scrape", "extract content", "get page text", "fetch article", or "pull content from URL".
---

# Exa

Search the web and scrape pages using the Exa API. Neural search finds relevant URLs; scrape extracts clean content. Pay-per-use at $0.001/result — no subscription.

**Setup:** Set `EXA_API_KEY` environment variable with your Exa API key.

**Fallback:** If Exa scrape returns thin/incomplete content (especially paywalled journals like NEJM or JS-heavy SPAs like ClinicalTrials.gov), use the `firecrawl` skill instead.

## Commands

Run all commands from this skill's base directory (shown above).

### Search the web

```bash
uv run python cli.py search "insolvency case law Australia"
```

### Search with filters

```bash
uv run python cli.py search "voidable transactions" --category news -n 20
uv run python cli.py search "s588FF" --include-domains jade.io,austlii.edu.au
uv run python cli.py search "bankruptcy reform" --start-date 2024-01-01
```

### Search and fetch content in one call

```bash
uv run python cli.py search "liquidator duties" --text
uv run python cli.py search "preference claims" --text --highlights
uv run python cli.py search "public examination" --summary "Extract the key legal ruling"
```

### Search dry run

```bash
uv run python cli.py search --dry-run "insolvency law" -n 20
```

### Scrape page text

```bash
uv run python cli.py scrape "https://example.com/page"
```

### Scrape multiple URLs

```bash
uv run python cli.py scrape "https://url1.com" "https://url2.com" "https://url3.com"
```

### Get highlights (key snippets)

```bash
uv run python cli.py scrape "https://example.com" --highlights
uv run python cli.py scrape "https://example.com" --highlights --query "key financial details"
```

### Get structured summary

```bash
uv run python cli.py scrape "https://example.com" --summary "Extract the main argument"
```

### Limit text length

```bash
uv run python cli.py scrape "https://example.com" --max-chars 5000
```

### Force fresh crawl (bypass cache)

```bash
uv run python cli.py scrape "https://example.com" --fresh
```

### Batch scrape from file

```bash
uv run python cli.py batch urls.txt
uv run python cli.py batch urls.txt --highlights --query "case law details"
uv run python cli.py batch urls.txt --summary "Summarize the legal ruling"
```

### Dry run (preview cost)

```bash
uv run python cli.py scrape --dry-run "https://url1.com" "https://url2.com"
```

---

## Auto-Save

All results are automatically saved to `results/` within this skill's directory.

Filenames: `scrape_{domain}_{timestamp}.json`

---

## Output

Returns JSON with full page content:

```json
{
  "results": [
    {
      "url": "https://example.com/page",
      "title": "Page Title",
      "author": "Author Name",
      "publishedDate": "2024-01-15",
      "text": "Full page text content...",
      "highlights": ["Key snippet 1", "Key snippet 2"],
      "summary": "Structured summary if requested"
    }
  ],
  "cost": 0.001,
  "pages": 1
}
```

---

## Cost Reference

| Feature | Cost per result |
|---------|----------------|
| Search (URLs only) | $0.001 |
| Text content | $0.001 |
| Highlights | +$0.001 |
| Summary | +$0.001 |
| Search + Text + Highlights + Summary | $0.004 |

**Examples:**
- Search 10 results, URLs only: $0.01
- Search 10 results + text: $0.02
- Scrape 100 pages, text only: $0.10
- Scrape 50 pages, text + highlights: $0.10

---

## Search Options

| Option | Description |
|--------|-------------|
| `-n` / `--num-results` | Number of results (default: 10, max: 100) |
| `--type` | Search type: `auto`, `neural`, `keyword` |
| `--category` | Focus: `company`, `research paper`, `news`, `tweet`, `personal site`, `financial report`, `people` |
| `--include-domains` | Comma-separated domains to include |
| `--exclude-domains` | Comma-separated domains to exclude |
| `--start-date` | Min publish date (YYYY-MM-DD) |
| `--end-date` | Max publish date (YYYY-MM-DD) |
| `--include-text` | Comma-separated strings that must appear in results |
| `--exclude-text` | Comma-separated strings that must not appear in results |
| `--text` | Also fetch page text content |
| `--highlights` | Also fetch key snippets |
| `--summary` | Also fetch summary with this prompt |
| `--max-chars` | Limit text length per result |
| `--dry-run` | Preview cost before executing |
| `--output` / `-o` | Save results to specific file |

## Scrape Options

| Option | Description |
|--------|-------------|
| `--dry-run` | Preview cost before executing |
| `--highlights` | Extract key snippets from each page |
| `--query` | Guide highlight/summary extraction (e.g., "financial details") |
| `--summary` | Get a summary with this prompt |
| `--max-chars` | Limit text length per page |
| `--fresh` | Force live crawl, bypass cache |
| `--output` / `-o` | Save results to specific file |
| `--no-text` | Exclude full text (useful with --highlights or --summary) |