---
name: web-scraping
description: Scrape web pages based on a provided URL using the scpr CLI app.
---

When asked to scrape a web page, use the `scpr` command line interface.

Basic usage (scrape a single page):

```bash
scpr --url https://example.com --output ./scraped
```

This scrapes the page and saves it as a markdown file in the `./scraped` folder.

**Recursive scraping**

To scrape a page and all linked pages within the same domain:

```bash
scpr --url https://example.com --output ./scraped --recursive --allowed example.com --max 3
```

**Parallel scraping**

Speed up recursive scraping with multiple threads:

```bash
scpr --url https://example.com --output ./scraped --recursive --allowed example.com --max 2 --parallel 5
```

**Additional options**

- `--log` - Set the logging level (info, debug, warn, error)
- `--max` - Maximum depth of pages to follow (default: 1)
- `--parallel` - Number of concurrent threads (default: 1)
- `--allowed` - Allowed domains for recursive scraping (can be specified multiple times)

For more details, run:

```bash
scpr --help
```

Once you are done scraping, scan the output folder to find the content the user asked for. Here is an example flow:

```bash
scpr --url https://example.com --output ./scraped --recursive --allowed example.com --max 2
cd ./scraped
grep -r "pattern of interest" .
```
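
To go from matches to the files that contain them, standard `grep` flags work on the scraped markdown; the pattern below is a hypothetical stand-in for whatever the user asked about:

```bash
# -l lists only the names of matching files; -i makes the search case-insensitive
grep -ril "pattern of interest" ./scraped
```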
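
Because `--allowed` can be repeated, a recursive crawl that spans more than one related domain can list each of them. A sketch, assuming the repeated-flag behavior described above; `docs.example.com` is a hypothetical second domain:

```bash
# Repeat --allowed once per domain the crawler may follow links into
scpr --url https://example.com --output ./scraped --recursive \
  --allowed example.com --allowed docs.example.com --max 2
```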
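
If a scrape fails or returns fewer pages than expected, raising the log level can help diagnose it. A minimal sketch, assuming `--log` takes one of the levels listed under **Additional options** as its argument:

```bash
# debug is the most verbose of the documented levels (info, debug, warn, error)
scpr --url https://example.com --output ./scraped --log debug
```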