---
name: conference-speaker-scraper
description: >
  Extract speaker names, titles, companies, and bios from conference websites.
  Supports direct HTML scraping and Apify web scraper fallback for JS-heavy sites.
  Use for pre-event research and outreach targeting.
---

# Conference Speaker Scraper

Extract speaker names, titles, companies, and bios from conference website /speakers pages. Supports direct HTML scraping with multiple extraction strategies, plus Apify fallback for JS-heavy sites.

## Quick Start

No API key needed for direct scraping mode.

```bash
# Scrape speakers from a conference page
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
  --url "https://example.com/speakers"

# Use Apify for JS-heavy sites
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
  --url "https://example.com/speakers" --mode apify

# Custom conference name (otherwise inferred from URL)
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py \
  --url "https://example.com/speakers" --conference "Sage Future 2026"

# Output formats
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output json  # default
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output csv
python3 skills/conference-speaker-scraper/scripts/scrape_speakers.py --url URL --output summary
```

## How It Works

### Direct Mode (default)

Fetches the page HTML and tries multiple extraction strategies in order, using whichever returns the most results:

1. **Strategy A -- CSS class hints:** Looks for speaker cards with class names containing "speaker", "presenter", "faculty", "panelist", "team-member"
2. **Strategy B -- Heading + paragraph patterns:** Looks for repeated heading + paragraph structures
3. **Strategy C -- JSON-LD structured data:** Checks for `