--- name: agent-browser description: Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction. allowed-tools: Bash(agent-browser:*) --- # Browser Automation with agent-browser ## Core Workflow Every browser automation follows this pattern: 1. **Navigate**: `agent-browser open ` 2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`) 3. **Interact**: Use refs to click, fill, select 4. **Re-snapshot**: After navigation or DOM changes, get fresh refs ```bash agent-browser open https://example.com/form agent-browser snapshot -i # Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit" agent-browser fill @e1 "user@example.com" agent-browser fill @e2 "password123" agent-browser click @e3 agent-browser wait --load networkidle agent-browser snapshot -i # Check result ``` ## Essential Commands ```bash # Navigation agent-browser open # Navigate agent-browser close # Close browser # Snapshot agent-browser snapshot -i # Interactive elements with refs (recommended) agent-browser snapshot -i -C # Include cursor-interactive elements # Interaction (use @refs from snapshot) agent-browser click @e1 # Click agent-browser fill @e2 "text" # Clear and type agent-browser select @e1 "option" # Select dropdown agent-browser check @e1 # Checkbox agent-browser press Enter # Key press # Wait agent-browser wait @e1 # Wait for element agent-browser wait --load networkidle # Wait for network idle # Capture agent-browser screenshot # Screenshot agent-browser screenshot --full # Full page ``` ## Ref Lifecycle (Important) Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after clicking links/buttons that navigate, form submissions, or dynamic content loading. ## Deep-Dive Documentation | Reference | When to Use | |-----------|-------------| | [references/commands.md](references/commands.md) | Full command reference with all options | | [references/snapshot-refs.md](references/snapshot-refs.md) | Ref lifecycle, invalidation rules, troubleshooting | | [references/session-management.md](references/session-management.md) | Parallel sessions, state persistence, concurrent scraping | | [references/authentication.md](references/authentication.md) | Login flows, OAuth, 2FA handling, state reuse | | [references/video-recording.md](references/video-recording.md) | Recording workflows for debugging and documentation | | [references/proxy-support.md](references/proxy-support.md) | Proxy configuration, geo-testing, rotating proxies | | [references/common-patterns.md](references/common-patterns.md) | Form submission, auth, data extraction, parallel sessions, iOS simulator | | [references/semantic-locators.md](references/semantic-locators.md) | Alternatives to refs | | [references/javascript-evaluation.md](references/javascript-evaluation.md) | eval rules, --stdin/-b explanation | ## Ready-to-Use Templates | Template | Description | |----------|-------------| | [templates/form-automation.sh](templates/form-automation.sh) | Form filling with validation | | [templates/authenticated-session.sh](templates/authenticated-session.sh) | Login once, reuse state | | [templates/capture-workflow.sh](templates/capture-workflow.sh) | Content extraction with screenshots |