--- name: agent-e2e description: Discover, record, and execute E2E test cases using browser exploration. Use when exploring test scenarios, recording user flows, organizing test structure, or executing YAML-based tests. --- # E2E Test Discovery & Execution You're here to discover E2E test scenarios, record them as YAML, or execute existing test cases. ## Core Principle: Semantic Selectors **The problem**: Traditional selectors (testId, CSS classes) break when UI changes. **The solution**: Use **semantic selectors** based on accessibility properties. These match what users see. ```yaml # ❌ Fragile {{dataTestId: "submit-btn"}} # ✅ Stable {{role: button, name: "Submit"}} ``` --- ## Phase 1: Discovering Test Scenarios Don't mechanically scan code/docs/UI. Use **risk-driven discovery**. ### High-Risk Indicators When exploring a project, prioritize areas with these characteristics: | Indicator | Why It's Risky | Example | |-----------|----------------|---------| | **Multi-field forms** | Validation, edge cases, state | Registration, checkout | | **Multi-step flows** | State carries across steps | Wizards, onboarding | | **Authentication boundaries** | Permissions, sessions | Login, role-based access | | **External integrations** | Dependencies can fail | Payment, third-party APIs | | **Data mutation** | Side effects matter | Create, update, delete | ### Discovery Strategy 1. **Start from risk**, not from code structure 2. **Use multiple sources** (code, docs, browser) to verify each risk 3. **Stop when confident**, not when sources are exhausted Ask yourself: "What would break that users would notice?" Start there. --- ## Phase 2: Recording Test Cases ### The Snapshot-Refs Pattern ```bash # 1. Open page agent-browser open http://localhost:3000 --headed # 2. Snapshot to discover elements agent-browser snapshot --json # Returns: {"refs": {"e1": {"name": "Email", "role": "textbox"}, ...}} # 3. Interact using refs agent-browser fill @e1 "test@example.com" agent-browser click @e2 # 4. Re-snapshot after DOM changes (refs get reassigned!) agent-browser snapshot --json ``` **Critical**: Always re-snapshot after any action that changes DOM. Refs are temporary. ### Recording as YAML Store **semantic selectors**, not refs: ```yaml steps: - desc: Submit login form commands: - agent-browser fill {{role: textbox, name: "Email"}} "{{env.EMAIL}}" - agent-browser click {{role: button, name: "Sign In"}} expect: - url_contains: /dashboard ``` --- ## Phase 3: Organizing Test Cases Use **complexity gradient** to enable fast root-cause analysis: ``` foundations/ → flows/ → compositions/ (atomic ops) (journeys) (multi-session) login.yaml checkout.yaml two-user-chat.yaml logout.yaml onboarding.yaml order-and-fulfill.yaml navigate.yaml password-reset.yaml ``` ### Why This Structure When a composition fails: 1. Check which flow failed 2. Check which foundation in that flow failed 3. Fix the foundation → everything above it recovers **Foundations** = building blocks (high reuse, must be stable) **Flows** = complete user journeys (compose foundations) **Compositions** = multi-session/multi-user scenarios (compose flows) ### Deciding the Category Ask: "Can this be broken into smaller independent tests?" - **Yes** → It's a flow or composition - **No** → It's a foundation --- ## Phase 4: Executing Test Cases ### Intent-Driven Execution YAML describes **intent + constraints**. You (the agent) fill in **execution details**. ```yaml # YAML says WHAT and WHEN - desc: Close popup if it appears commands: - agent-browser click {{role: button, name: "Close"}} optional: true # Constraint: this can fail # You decide HOW # 1. Snapshot to check if button exists # 2. If exists → click # 3. If not → skip (because optional: true) # 4. Record your decision for debugging ``` ### Execution Rules 1. **For each step**: snapshot → match selector → execute → verify expect 2. **For optional steps**: skip if element not found, but record the decision 3. **For failures**: report which selector didn't match and what was found instead 4. **Always**: close browser when done ### Handling Complex Scenarios **Conditional steps** (optional: true): ```yaml - desc: Dismiss cookie banner commands: - agent-browser click {{role: button, name: /Accept|Close|×/}} optional: true ``` You try the selector. If not found, skip and continue. **Data extraction**: ```yaml - desc: Capture order ID from page extract: from: {{role: heading, name: /Order #\d+/}} pattern: "Order #(\\d+)" save_as: order_id ``` You find the element, extract the value, save it for later steps. **Multi-session** (compositions): ```yaml sessions: - name: alice lifecycle: persistent - name: bob lifecycle: persistent steps: - desc: Alice sends message session: alice commands: [...] - desc: Verify Bob received it session: bob expect: - element_visible: {{role: paragraph, name: /Hello/}} ``` You manage separate browser sessions and switch between them. --- ## YAML Format Reference Minimal structure: ```yaml id: case-name title: Human-readable title category: foundation # foundation | flow | composition description: | What this tests and why it matters. parameters: base_url: http://localhost:3000 steps: - desc: Step description commands: - agent-browser {{selector}} [args] expect: - condition: value optional: false # default exploration: status: validated # draft | explored | validated ``` See [FORMATS.md](FORMATS.md) for complete specification. --- ## Common Issues **Empty snapshot**: Page hasn't loaded or lacks accessibility. Wait longer or check framework. **Refs changed**: Expected. Always re-snapshot before interacting. **Multiple matches**: Use index: `{{role: button, name: "Delete", index: 0}}` **Static routing**: `/docs` may need `/docs.html`. Note in your test case. **Rate limiting (429)**: Public sites limit unauthenticated requests. Consider: - Adding delays between steps (`agent-browser wait 2000`) - Using authenticated sessions when available - Recording on local dev instances when possible --- ## Checklist When discovering: - [ ] Identified high-risk areas (forms, multi-step, auth, integrations) - [ ] Verified risks from multiple sources - [ ] Prioritized by user impact When recording: - [ ] Used semantic selectors, not refs - [ ] Re-snapshot after DOM changes - [ ] Added expect conditions for verification When organizing: - [ ] Categorized by complexity (foundation/flow/composition) - [ ] Foundations are atomic and reusable - [ ] Compositions reference flows, flows reference foundations When executing: - [ ] Followed intent, applied judgment for details - [ ] Recorded decisions for debugging - [ ] Closed all browser sessions