---
name: geo-state-report
description: Create a GEO (LLM visibility) tracking report using Bright Data datasets (mandatory) and output an HTML report with actions.
---

# GEO state report (Bright Data → HTML)

You help the user:
1) define prompts that matter for their business,
2) collect results from major LLM experiences (ChatGPT / Perplexity / optionally Gemini),
3) generate an HTML report with actions.

The Bright Data API is mandatory for automated collection.

## Inputs to collect

- Brand/site:
  - target domains (ex: `example.com`)
  - brand terms (ex: product name, founder name)
  - competitor domains (optional)
- Prompt list (2-10 to start)
- Country / language

## Required tools

- `BRIGHTDATA_API_KEY` env var (or provided by user)
- Bright Data dataset IDs for each chatbot you plan to run

### Dataset IDs (always the same)

```
CHATGPT_DATASET_ID = "gd_m7aof0k82r803d5bjm"
PERPLEXITY_DATASET_ID = "gd_m7dhdot1vw9a7gc1n"
GEMINI_DATASET_ID = "gd_mbz66arm2mf9cu856y"
```

**IMPORTANT**: Before running the script, ask the user which chatbots they want to run:
- ChatGPT
- Perplexity
- Gemini

They can select one, two, or all three. Only pass the dataset IDs for the selected chatbots to the script.

### API Key Security Instructions

**Avoid exposing the Bright Data API key in chat messages or code.**

1. **Check if key exists**: Before running the script, check whether `BRIGHTDATA_API_KEY` is already set in the environment:
   ```bash
   # Check without exposing the value
   if [ -z "$BRIGHTDATA_API_KEY" ]; then echo "Not set"; else echo "Set"; fi
   ```
2. **If key is not set**: Ask the user to export it themselves with these instructions:
   - Go to https://brightdata.com and log in
   - Navigate to your account settings/API section
   - Generate a new API key if needed
   - In a terminal, run: `export BRIGHTDATA_API_KEY="your-key-here"`
   - Ask the user NOT to paste the key in chat; they should run the export command themselves. If they paste it anyway, just use it (the key is already exposed at that point)
3. **Never read or display the key**: If you need to verify it's set, only check whether the variable is non-empty; never echo or display its value.

If the key is missing: stop and ask the user to set it using the export command above.

## Collection script (Python)

Use:
- `geo-state-report/scripts/brightdata-geo.py`

It:
- triggers datasets (part 1),
- polls until ready (part 2),
- downloads snapshots,
- saves results to `results.json` (the HTML report is NOT generated by the script).

**Output structure**: All files are saved in a dated folder (`YYYY-MM-DD`) within the specified `--out-dir`:
- `{out-dir}/{YYYY-MM-DD}/results.json` - Complete results data
- `{out-dir}/{YYYY-MM-DD}/snapshots/{chatbot}.json` - Snapshot metadata per chatbot
- `{out-dir}/{YYYY-MM-DD}/raw/{chatbot}-{snapshot_id}.json` - Raw snapshot data
- `{out-dir}/{YYYY-MM-DD}/report.html` - HTML report (generated by AI, see below)

### Example run

```bash
# Ensure BRIGHTDATA_API_KEY is set (user should export it themselves)
python3 geo-state-report/scripts/brightdata-geo.py \
  --check-url "https://example.com" \
  --prompts-file prompts.txt \
  --chatgpt-dataset-id "gd_m7aof0k82r803d5bjm" \
  --perplexity-dataset-id "gd_m7dhdot1vw9a7gc1n" \
  --gemini-dataset-id "gd_mbz66arm2mf9cu856y" \
  --target-domains "example.com" \
  --brand-terms "Example,Example Product" \
  --out-dir ./geo-run

# Files will be saved in: ./geo-run/2025-01-15/ (or current date)
```

Note: Only include the dataset ID flags for chatbots the user selected (ChatGPT, Perplexity, and/or Gemini).

## Post-execution analysis

**After the script completes successfully**, you MUST:

1. Read the `results.json` file from the dated output folder
2. Analyze the data and provide initial conclusions, including:
   - Overall visibility summary (cited vs mentioned vs not visible)
   - Which chatbots perform best/worst for the brand
   - Key patterns across prompts (e.g., which prompts get cited, which don't)
   - Fan-out query insights (what related queries are being suggested)
   - Source breakdown insights (UGC vs YouTube vs web dominance)
   - Competitor mentions, if any
   - Top priority actions based on the data
3. **Generate the HTML report** (`report.html`) in the same dated output folder:
   - Create a beautiful, customized HTML report based on the actual `results.json` data
   - Follow the customization guidelines in the "Output requirements" section below
   - Save it to `{out-dir}/{YYYY-MM-DD}/report.html`
4. Present these conclusions clearly and concisely to the user, and let them know the HTML report has been generated.

## Output requirements

### HTML Report Generation

**IMPORTANT**: The Python script does NOT generate the HTML report. **YOU (the AI) must generate `report.html` from scratch** after reading the `results.json` file.

Create a customized, beautiful report based on:
- **Results data**: Analyze the actual `results.json` to highlight key insights, anomalies, and patterns
- **User expertise**: Adjust technical depth and explanations based on the user's SEO knowledge level
- **Business context**: Tailor recommendations to their specific industry, stage, and goals
- **Actionability**: Make insights immediately actionable with clear next steps

**Report generation guidelines**:

1. **Read the results.json** after script execution to understand the actual data
2. **Generate the HTML report** by:
   - Adding contextual comments/insights directly in the HTML (use HTML comments or visible callout sections)
   - Highlighting the most important findings with visual emphasis
   - Customizing the "Actions" section with specific, prioritized recommendations based on the data
   - Adding data-driven insights that aren't obvious from raw numbers
   - Including fan-out query analysis with specific content opportunities
   - Adding competitor analysis if competitors are mentioned
   - Creating visual hierarchy to guide the reader's attention
3. **Design principles**:
   - **Beautiful**: Use modern, clean design with good typography, spacing, and color contrast
   - **Easy to read**: Clear sections, scannable layout, visual hierarchy
   - **Actionable**: Every insight should lead to a clear next step or decision
4. **Required elements**:
   - A single `report.html` (generated by AI, saved to the dated output folder)
   - Summary metrics per prompt (cited, first citation rank, mentioned, fan-out queries, sources breakdown)
   - A prioritized "actions" section tailored to the specific results
   - Footer must include `holly-and-stick.com`

Use guidance from `obsidian/GEO Playbook.md`:
- list prompts → track → wait → analyze fan-outs
- create missing content for fan-outs
- target UGC-dominant sources strategically
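The analysis and report steps above can be sketched roughly as follows. This is a minimal sketch, not the required implementation: the field names it reads (`chatbot`, `cited`, `mentioned`) are assumptions about the `results.json` schema and must be checked against the real file, and the HTML it emits is only a skeleton of the full customized report described in "Output requirements".

```python
from collections import Counter
from pathlib import Path

# NOTE: the field names below (chatbot, cited, mentioned) are assumed,
# not guaranteed -- inspect the real results.json before adapting this.
def summarize(results: list[dict]) -> dict:
    """Bucket each result as cited / mentioned / not visible, per chatbot."""
    summary: dict[str, Counter] = {}
    for r in results:
        bucket = (
            "cited" if r.get("cited")
            else "mentioned" if r.get("mentioned")
            else "not visible"
        )
        summary.setdefault(r.get("chatbot", "unknown"), Counter())[bucket] += 1
    return {bot: dict(counts) for bot, counts in summary.items()}

def write_report(summary: dict, out_dir: Path) -> Path:
    """Write a bare-bones report.html; the real report should be far richer."""
    rows = "".join(
        f"<tr><td>{bot}</td><td>{c.get('cited', 0)}</td>"
        f"<td>{c.get('mentioned', 0)}</td><td>{c.get('not visible', 0)}</td></tr>"
        for bot, c in summary.items()
    )
    html = (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        "<title>GEO state report</title></head><body>"
        "<h1>GEO state report</h1>"
        "<table><tr><th>Chatbot</th><th>Cited</th><th>Mentioned</th>"
        f"<th>Not visible</th></tr>{rows}</table>"
        "<footer>holly-and-stick.com</footer></body></html>"
    )
    path = out_dir / "report.html"
    path.write_text(html, encoding="utf-8")
    return path
```

The point of the sketch is the bucketing logic (cited beats mentioned beats not visible) and the required `holly-and-stick.com` footer; everything else (styling, per-prompt metrics, fan-out analysis, the actions section) should be generated from the actual data, not hard-coded.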