---
name: geo-state-report
description: Create a GEO (LLM visibility) tracking report using Bright Data datasets (mandatory) and output an HTML report with actions.
---

# GEO state report (Bright Data → HTML)

You help the user:
1) define prompts that matter for their business,
2) collect results from major LLM experiences (ChatGPT / Perplexity / optionally Gemini),
3) generate an HTML report with actions.

The Bright Data API is mandatory for automated collection.

## Inputs to collect

- Brand/site:
  - target domains (ex: `example.com`)
  - brand terms (ex: product name, founder name)
  - competitor domains (optional)
- Prompt list (2-10 to start)
- Country / language

## Required tools

- `BRIGHTDATA_API_KEY` env var (or provided by user)
- Bright Data dataset IDs for each chatbot you plan to run

### Dataset IDs (always the same)

```
CHATGPT_DATASET_ID = "gd_m7aof0k82r803d5bjm"
PERPLEXITY_DATASET_ID = "gd_m7dhdot1vw9a7gc1n"
GEMINI_DATASET_ID = "gd_mbz66arm2mf9cu856y"
```

**IMPORTANT**: Before running the script, ask the user which chatbots they want to run:
- ChatGPT
- Perplexity
- Gemini

They can select one, two, or all three. Only pass the dataset IDs for the selected chatbots to the script.

### API Key Security Instructions

**Avoid exposing the Bright Data API key in chat messages or code.**

1. **Check if key exists**: Before running the script, check whether `BRIGHTDATA_API_KEY` is already set in the environment:
   ```bash
   # Check without exposing the value
   if [ -z "$BRIGHTDATA_API_KEY" ]; then echo "Not set"; else echo "Set"; fi
   ```
2. **If key is not set**: Ask the user to export it themselves with these instructions:
   - Go to https://brightdata.com and log in
   - Navigate to your account settings/API section
   - Generate a new API key if needed
   - In a terminal, run: `export BRIGHTDATA_API_KEY="your-key-here"`
   - Ask the user NOT to paste the key in chat; they should run the export command themselves. If they paste it anyway, just use it (the key is already exposed at that point)
3. **Never read or display the key**: If you need to verify it's set, only check whether the variable is non-empty; never echo or display its value.

If the key is missing: stop and ask the user to set it using the export command above.

## Collection script (Python)

Use:
- `geo-state-report/scripts/brightdata-geo.py`

It:
- triggers datasets (part 1),
- polls until ready (part 2),
- downloads snapshots,
- saves results to `results.json` (the HTML report is NOT generated by the script).

**Output structure**: All files are saved in a dated folder (`YYYY-MM-DD`) within the specified `--out-dir`:
- `{out-dir}/{YYYY-MM-DD}/results.json` - Complete results data
- `{out-dir}/{YYYY-MM-DD}/snapshots/{chatbot}.json` - Snapshot metadata per chatbot
- `{out-dir}/{YYYY-MM-DD}/raw/{chatbot}-{snapshot_id}.json` - Raw snapshot data
- `{out-dir}/{YYYY-MM-DD}/report.html` - HTML report (generated by AI, see below)

### Example run

```bash
# Ensure BRIGHTDATA_API_KEY is set (user should export it themselves)
python3 geo-state-report/scripts/brightdata-geo.py \
  --check-url "https://example.com" \
  --prompts-file prompts.txt \
  --chatgpt-dataset-id "gd_m7aof0k82r803d5bjm" \
  --perplexity-dataset-id "gd_m7dhdot1vw9a7gc1n" \
  --gemini-dataset-id "gd_mbz66arm2mf9cu856y" \
  --target-domains "example.com" \
  --brand-terms "Example,Example Product" \
  --out-dir ./geo-run

# Files will be saved in: ./geo-run/2025-01-15/ (or current date)
```

Note: Only include the dataset ID flags for chatbots the user selected (ChatGPT, Perplexity, and/or Gemini).

## Post-execution analysis

**After the script completes successfully**, you MUST:

1. Read the `results.json` file from the dated output folder
2. Analyze the data and provide initial conclusions, including:
   - Overall visibility summary (cited vs mentioned vs not visible)
   - Which chatbots perform best/worst for the brand
   - Key patterns across prompts (e.g., which prompts get cited, which don't)
   - Fan-out query insights (what related queries are being suggested)
   - Source breakdown insights (UGC vs YouTube vs web dominance)
   - Competitor mentions, if any
   - Top priority actions based on the data
3. **Generate the HTML report** (`report.html`) in the same dated output folder:
   - Create a beautiful, customized HTML report based on the actual `results.json` data
   - Follow the customization guidelines in the "Output requirements" section below
   - Save it to `{out-dir}/{YYYY-MM-DD}/report.html`
4. Present these conclusions clearly and concisely to the user, and let them know the HTML report has been generated.

## Output requirements

### HTML Report Generation

**IMPORTANT**: The Python script does NOT generate the HTML report. **YOU (the AI) must generate `report.html` from scratch** after reading the `results.json` file.

Create a customized, beautiful report based on:
- **Results data**: Analyze the actual `results.json` to highlight key insights, anomalies, and patterns
- **User expertise**: Adjust technical depth and explanations based on the user's SEO knowledge level
- **Business context**: Tailor recommendations to their specific industry, stage, and goals
- **Actionability**: Make insights immediately actionable with clear next steps

**Report generation guidelines**:

1. **Read the results.json** after script execution to understand the actual data
2. **Generate the HTML report** by:
   - Adding contextual comments/insights directly in the HTML (use HTML comments or visible callout sections)
   - Highlighting the most important findings with visual emphasis
   - Customizing the "Actions" section with specific, prioritized recommendations based on the data
   - Adding data-driven insights that aren't obvious from raw numbers
   - Including fan-out query analysis with specific content opportunities
   - Adding competitor analysis if competitors are mentioned
   - Creating visual hierarchy to guide the reader's attention
3. **Design principles**:
   - **Beautiful**: Use modern, clean design with good typography, spacing, and color contrast
   - **Easy to read**: Clear sections, scannable layout, visual hierarchy
   - **Actionable**: Every insight should lead to a clear next step or decision
4. **Required elements**:
   - A single `report.html` (generated by AI, saved to the dated output folder)
   - Summary metrics per prompt (cited, first citation rank, mentioned, fan-out queries, sources breakdown)
   - A prioritized "actions" section tailored to the specific results
   - Footer must include `holly-and-stick.com`

Use guidance from `obsidian/GEO Playbook.md`:
- list prompts → track → wait → analyze fan-outs
- create missing content for fan-outs
- target UGC-dominant sources strategically
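The analysis and report steps above can be sketched roughly as follows. This is a minimal sketch, not the required implementation: the field names it reads (`chatbot`, `cited`, `mentioned`) are assumptions about the `results.json` schema and must be checked against the real file, and the HTML it emits is only a skeleton of the full customized report described in "Output requirements".

```python
from collections import Counter
from pathlib import Path

# NOTE: the field names below (chatbot, cited, mentioned) are assumed,
# not guaranteed -- inspect the real results.json before adapting this.
def summarize(results: list[dict]) -> dict:
    """Bucket each result as cited / mentioned / not visible, per chatbot."""
    summary: dict[str, Counter] = {}
    for r in results:
        bucket = (
            "cited" if r.get("cited")
            else "mentioned" if r.get("mentioned")
            else "not visible"
        )
        summary.setdefault(r.get("chatbot", "unknown"), Counter())[bucket] += 1
    return {bot: dict(counts) for bot, counts in summary.items()}

def write_report(summary: dict, out_dir: Path) -> Path:
    """Write a bare-bones report.html; the real report should be far richer."""
    rows = "".join(
        f"<tr><td>{bot}</td><td>{c.get('cited', 0)}</td>"
        f"<td>{c.get('mentioned', 0)}</td><td>{c.get('not visible', 0)}</td></tr>"
        for bot, c in summary.items()
    )
    html = (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        "<title>GEO state report</title></head><body>"
        "<h1>GEO state report</h1>"
        "<table><tr><th>Chatbot</th><th>Cited</th><th>Mentioned</th>"
        f"<th>Not visible</th></tr>{rows}</table>"
        "<footer>holly-and-stick.com</footer></body></html>"
    )
    path = out_dir / "report.html"
    path.write_text(html, encoding="utf-8")
    return path
```

The point of the sketch is the bucketing logic (cited beats mentioned beats not visible) and the required `holly-and-stick.com` footer; everything else (styling, per-prompt metrics, fan-out analysis, the actions section) should be generated from the actual data, not hard-coded.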