---
name: opencli-browser
description: Use when an agent needs to drive a real Chrome window via opencli to inspect a page, fill forms, click through logged-in flows, or extract data ad-hoc. Covers the selector-first target contract, compound form fields, stale-ref handling, network capture, and the agent-native envelopes the CLI returns. Not for writing adapters; see opencli-adapter-author for that.
allowed-tools: Bash(opencli:*), Read, Edit, Write
---

# opencli-browser

The first reader of this CLI is an agent, not a human. Every subcommand returns a structured envelope that tells you exactly what matched, how confident the match is, and what to do if it didn't. Lean on those envelopes; do not guess.

This skill is for **driving a live browser** to accomplish an agent task. If you are building a reusable adapter under `~/.opencli/clis//` use `opencli-adapter-author` instead.

---

## Prerequisites

```bash
opencli doctor
```

Until `doctor` is green, nothing else will work. Typical failures: Chrome not running, extension not installed, debug port blocked by 1Password / other extensions. The doctor output tells you which.

---

## Window lifecycle

- `opencli browser *` commands already keep the automation session alive between calls. The window stays open until you run `opencli browser close` or the idle timeout expires.
- `--focus` (or `OPENCLI_WINDOW_FOCUSED=1`) opens the automation window in the foreground. Use it when you want to watch the page live.
- `--live` (or `OPENCLI_LIVE=1`) is mainly for browser-backed adapter commands such as `opencli xiaohongshu note ...`. It keeps the adapter's automation window open after the command returns so you can inspect the final page state.

---

## Mental model

1. **Selector-first target contract.** Every interaction command (`click`, `type`, `select`, `get text/value/attributes`) takes one ``, which is *either* a numeric ref from `state`/`find` *or* a CSS selector. Use `--nth ` to disambiguate multiple CSS matches.
2. **Every envelope reports `matches_n` and `match_level`.** `match_level` is `exact`, `stable`, or `reidentified`. The CLI has already rescued moderate DOM drift for you, but the level tells you how confident to be.
3. **Compact output first, full payload on demand.** `state` is a budget-aware snapshot; `get html --as json` supports `--depth/--children-max/--text-max`; `network` returns shape previews and you re-fetch a single body with `--detail `. If you emit a giant payload you are burning context you did not need to burn.
4. **Structured errors are machine-readable.** On failure the CLI emits `{error: {code, message, hint?, candidates?}}`. Branch on `code`, not on message strings.

---

## Critical rules

1. **Always inspect before you act.** Run `state` or `find` first. Never hard-code a ref or selector from memory across sessions: indices are per-snapshot.
2. **Prefer numeric ref over CSS once you have it.** Numeric refs survive mild DOM shifts because the CLI fingerprints each tagged element. A CSS selector written by hand will break the first time the site re-renders.
3. **Read `match_level` after every write.** `exact` = all good. `stable` = the element is the same but some soft attrs drifted; your action still applied. `reidentified` = the original ref was gone and the CLI found a unique replacement; double-check you hit the right element.
4. **Use the `compound` field for form controls.** Do not regex-guess a date format, do not `state` twice to get the full ``.
5. **Verify writes that matter.** After `type `, run `get value `. After `select`, run `get value`. Autocomplete widgets, React controlled inputs, and masked fields all silently eat characters. The CLI cannot detect this for you.
6. **`state` → action → `state` after a page change.** Navigations, form submits, and SPA route changes invalidate refs. Take a fresh snapshot. Do not reuse refs from before the transition.
7. **Chain with `&&`.** A chained sequence runs in one shell so refs acquired by the first command stay live for the second. Separate shell invocations lose the session context you just set up.
8. **`eval` is read-only.** Wrap the JS in an IIFE and return JSON. If you need to *change* the page, use the structured `click` / `type` / `select` / `keys` commands instead: they produce structured output and fingerprints; `eval` does not.
9. **Prefer `network` to screen-scraping.** If a page you care about fetches its data from a JSON API, the API is almost always more reliable than scraping the rendered DOM. Capture once, inspect the shape, then `--detail ` the body you need.

---

## Target contract (`` for click / type / select / get text|value|attributes)

```
 ::= | 
```

- **Numeric ref**: the `[N]` index from `state` or `find`. Cheap, resilient to soft DOM drift.
- **CSS selector**: anything `querySelectorAll` accepts. Must be unambiguous on write ops, or pair with `--nth `.

### Envelope on success

```json
{ "clicked": true, "target": "3", "matches_n": 1, "match_level": "exact" }
```

```json
{ "value": "kalevin@example.com", "matches_n": 1, "match_level": "stable" }
```

### match_level

| level | meaning | you should |
|-------|---------|------------|
| `exact` | Fingerprint agreed on tag + strong IDs with at most one soft drift | Proceed. |
| `stable` | Tag + strong IDs still agree, soft signals (aria-label, role, text) drifted | Proceed, but if *what* you typed/clicked matters, re-check with `get value` or `state`. |
| `reidentified` | Original ref was gone; a unique live element matched the fingerprint and was re-tagged with the old ref | Double-check you hit the right element before chaining more writes. |

### Structured error codes

Branch on these, not on the human message:

| code | meaning |
|------|---------|
| `not_found` | Numeric ref is no longer in the DOM. Re-`state`. |
| `stale_ref` | Ref exists but the element at that ref changed identity. Re-`state`. |
| `invalid_selector` | CSS was rejected by `querySelectorAll`. Fix the selector. |
| `selector_not_found` | CSS matches 0 elements. Try `find` with a looser selector. |
| `selector_ambiguous` | CSS matches >1 and no `--nth`. Add `--nth` or narrow the selector. |
| `selector_nth_out_of_range` | `--nth` beyond match count. |
| `option_not_found` | `select` couldn't find an option matching that label/value. Error envelope includes `available: string[]` of the real option labels. |
| `not_a_select` | `select` was called on a non-`

```bash
opencli browser find --css "select[name=country]"
# the compound.options_total is 137, but compound.current is "" (unselected).
opencli browser select 12 "Uruguay"
opencli browser get value 12
# { value: "uy", match_level: "exact" }
```

### Scrape a list via network instead of DOM

```bash
opencli browser open "https://news.ycombinator.com"
opencli browser network --filter "title,score"
# -> find the /topstories entry, note its key
opencli browser network --detail topstories-a1b2
```

### Read a long article in chunks

```bash
opencli browser open "https://blog.example.com/long-post"
opencli browser extract --chunk-size 8000
# -> content + next_start_char: 8000
opencli browser extract --start 8000 --chunk-size 8000
# ...until next_start_char is null
```

### Cross-origin iframe

```bash
opencli browser frames
# -> [{"index": 0, "url": "https://checkout.stripe.com/...", ...}]
opencli browser eval "(() => document.querySelector('input[name=cardnumber]')?.value)()" --frame 0
```

---

## Pitfalls

- **Do not submit forms via `eval "document.forms[0].submit()"`.** Modern sites intercept with JS handlers and silently drop the call. Either `click` the submit button via its ref, or (if you know the GET URL) just `open` it directly.
- **Do not reuse refs across a page transition.** `wait` for the new state, then re-`state`. Old refs will either 404 or (worse) `reidentify` onto a similarly-shaped element on the new page.
- **`match_level: reidentified` is a warning, not an error.** The action went through, but if you are chaining 5 more writes that all depend on that being the right element, verify with a `get text` or `get value` before continuing.
- **Budget-aware commands silently cap.** `get html --as json` with default budgets will return `truncated: {...}`. If your downstream logic needs the whole subtree, raise `--depth` / `--children-max` or tighten the selector.
- **`autocomplete: true` on a `type` response is not an error.** It means a suggestion popup is open and your value isn't committed yet. Typically `keys Enter` to accept the first suggestion, or `click` the one you want.
- **`network --filter` is AND-semantics on path segments.** `--filter "title,score"` keeps entries whose body shape contains *both* `title` and `score` as path segments, at any depth. It is not a regex.
- **Screenshots are for humans, not for agents.** Use `state` + `find` unless the page is genuinely visual (captcha, chart). Screenshots burn tokens and rarely add signal an agent can act on.

---

## Troubleshooting

| symptom | fix |
|---------|-----|
| `opencli doctor` red: "Browser not connected" | Start Chrome with `--remote-debugging-port=9222`, or install the extension from the [Chrome Web Store](https://chromewebstore.google.com/detail/opencli/ildkmabpimmkaediidaifkhjpohdnifk). |
| `attach failed: chrome-extension://...` | Disable 1Password / other CDP-hungry extensions temporarily. |
| `selector_not_found` right after `state` | Page mutated. `wait selector "..."` then retry. |
| `stale_ref` across every command | You are reusing refs from a prior page. Re-`state`. |
| `click` succeeds but nothing happens | The element is probably a decorative wrapper stealing clicks from the real target. `find --css "..."` with a narrower selector and retry on the inner element. |
| `type` appears to finish but value is wrong | Autocomplete, masked input, or React controlled re-render. Verify with `get value`. Add `keys Enter` or re-type. |
| Giant `get html` output | Pass `--selector` + `--as json --depth 3 --children-max 20 --text-max 200`. |
| Network cache seems stale | Bump `--ttl` down, or let it expire. The cache lives at `~/.opencli/cache/browser-network/`. |

---

## See also

- `opencli-adapter-author`: turning what you just figured out into a reusable `~/.opencli/clis//.js`.
- `opencli-autofix`: when an existing adapter breaks, this skill walks you through `OPENCLI_DIAGNOSTIC` and filing a fix.
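---

The structured error codes listed under the target contract are meant to be branched on directly. A minimal sketch of that dispatch in shell follows. The function name `next_step_for_code` and the step labels it prints are hypothetical and not part of opencli; the mapping only restates the fixes given in the error-code table, and any code without a stated fix falls through to a default that defers to the envelope's `hint`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not an opencli command): map an error code from the
# {error: {code, ...}} envelope to the recovery step this document suggests.
next_step_for_code() {
  case "$1" in
    not_found|stale_ref) echo "re-state" ;;            # refs are stale: take a fresh snapshot
    invalid_selector)    echo "fix-selector" ;;        # CSS rejected by querySelectorAll
    selector_not_found)  echo "find-looser" ;;         # retry `find` with a looser selector
    selector_ambiguous)  echo "add-nth" ;;             # add --nth or narrow the selector
    option_not_found)    echo "pick-from-available" ;; # choose from the envelope's `available` list
    *)                   echo "read-hint" ;;           # no fix listed here: fall back to the hint
  esac
}
```

Extracting `code` from the JSON envelope is left to whatever JSON tooling the agent harness already has; the sketch only covers the branching step.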