# Changelog

All notable changes to turbo-surf are documented here. Format follows [Keep a Changelog](https://keepachangelog.com/); versions follow SemVer.

## [0.2.7]

In-house solver maturity: a proper Cloudflare solve (run the challenge's own JS), Akamai experimental recon/rebuild tooling, and versioned encoding registries.

### Fixed

- **Hyper Solutions adapter matched to the real API** (`hyper-sdk-go`) — was built on wrong assumptions. Correct now: `POST akm.hypersolutions.co/v2/sensor` with the `x-api-key` header and `{abck,bmsz,version,pageUrl,userAgent,script,acceptLanguage,ip}` body; the response `{payload}` is the `sensor_data` string, which turbo-surf POSTs to the target to harvest `_abck`. Akamai lane verified end-to-end (mock); other vendors return `Unsupported` to fall back. Scrapfly adapter validated against its docs (`/scrape?asp&render_js`, `result.cookies[].name/value`) — already correct.

### Added

- **AWS WAF Bot Control solver** (`turbo-surf-core::aws_waf`, `TURBO_SURF_SOLVER=awswaf`) — the bot layer behind CloudFront / ALB. Classifies the tier (common / targeted `challenge.js` / captcha), runs `challenge.js` in the V8 tier (same `PowEngine` as Cloudflare) to mint the `aws-waf-token`, and replays it; the `captcha` tier routes to the browser sidecar. New `Vendor::AwsWaf` + detection (`x-amzn-waf-action`, `aws-waf-token`, `*.awswaf.com`).
- **Universal browser fallback for in-house solvers** — `cloudflare`, `awswaf`, and `akamai` now all try their self-solve first and fall back to the browser sidecar when `TURBO_SURF_BROWSER_CMD` is set (via `FallbackSolver`), so a failed in-house solve still clears the wall.
- **Proper Cloudflare solver (run the challenge's own JS)** — a `PowEngine` trait (core) implemented by the render tier (`turbo_surf_render::V8PowEngine`) lets `CloudflareSolver` execute the interstitial's challenge script in the V8 isolate and use the answer it computes, instead of the structural placeholder — the proper self-solve for CF's JS-compute challenge, no browser. Wired in the MCP session via `solver_from_env_pow`.
- **Akamai experimental routing + `analyze_akamai` MCP tool** — the in-house Akamai solver is now flagged experimental and, when `TURBO_SURF_BROWSER_CMD` is set, routes through a `FallbackSolver` (try in-house → fall back to the browser sidecar). New `analyze_akamai` tool: probe the live Akamai script on the current page, hash it, build candidate `sensor_data` per stored version, and with `{retry:true}` POST each candidate, test live acceptance, and **save a working sensor locally** (`TURBO_SURF_SENSOR_DIR`).
- **Versioned solver encodings** — both in-house solvers now store *multiple* generations of their challenge encoding behind a registry, since Akamai/CF shift format across versions. Akamai `SensorVersion` {V1 plaintext, V2 PRNG-shuffled, V3 encrypted-blob} via `generate_sensor_versioned`; Cloudflare `ChallengeVersion` {Iuam, Managed, Turnstile} via `detect_version` + `solve_pow_versioned` (Turnstile flagged non-self-solvable → routes to the browser sidecar). A harness test per vendor sweeps every stored version (deterministic + distinct + correctly tagged), so filling one version's real encoding keeps the rest green. Default tracks the latest generation.

## [0.2.6]

**Look like a real Chrome on the wire.** The stock client sent a bare `turbo-surf/0.1` UA + a thin `Accept` and a generic rustls TLS/HTTP-2 fingerprint — an instant tell for WAFs.
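The header behaviour this release introduces (Chrome-style defaults, caller/crawl headers still winning, `accept-encoding` left to the client) could look roughly like the sketch below. It is not turbo-surf's actual code: `DEFAULT_HEADERS` and `merge_headers` are hypothetical names, and the header values are placeholders or typical Chrome values chosen for illustration.

```rust
use std::collections::BTreeMap;

/// Illustrative defaults only: the names come from the header set listed for
/// this release, the values are placeholders, not turbo-surf's real strings.
/// `accept-encoding` is deliberately absent so the HTTP client keeps managing it.
const DEFAULT_HEADERS: &[(&str, &str)] = &[
    ("user-agent", "<chrome-ua-placeholder>"),
    ("accept", "<chrome-accept-placeholder>"),
    ("accept-language", "en-US,en;q=0.9"),
    ("sec-ch-ua-mobile", "?0"),
    ("sec-fetch-mode", "navigate"),
    ("upgrade-insecure-requests", "1"),
];

/// Hypothetical helper: start from the defaults, then let caller headers win.
/// Names are lowercased so `User-Agent` and `user-agent` collide as intended.
fn merge_headers(overrides: &[(&str, &str)]) -> BTreeMap<String, String> {
    let mut merged: BTreeMap<String, String> = DEFAULT_HEADERS
        .iter()
        .map(|(name, value)| (name.to_ascii_lowercase(), value.to_string()))
        .collect();
    for (name, value) in overrides {
        merged.insert(name.to_ascii_lowercase(), value.to_string());
    }
    merged
}
```

Calling `merge_headers(&[("User-Agent", "custom-agent/1.0")])` replaces only the `user-agent` entry and leaves the remaining defaults untouched.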
### Added

- **Chrome default headers** (default, rustls path, no new build deps) — every fetch now sends a current Chrome 149 (macOS) UA plus the full navigation header set (`accept`, `accept-language`, `sec-ch-ua`/`-mobile`/`-platform`, `sec-fetch-*`, `upgrade-insecure-requests`), values matched against a live real-Chrome capture. `accept-encoding` stays client-managed so auto-decompress still works; caller/crawl headers still override.
- **`impersonate` feature** (opt-in, BoringSSL) — swaps the reqwest+rustls client for `wreq`/`wreq-util`, presenting a real Chrome TLS/JA3/JA4 + HTTP-2 (Akamai) fingerprint. Off by default (needs a C toolchain — cmake/nasm — to build); forwarded by `turbo-surf-{page,napi,mcp}`. A single `http_backend` alias in `turbo-surf-core` swaps the backend in one place. New live e2e (`tests/impersonate.rs`) asserts a Chrome JA4 + HTTP-2 fingerprint against a public echo (auto-skips offline); a localhost e2e asserts the Chrome headers reach the wire on the default path.
- **Real Chrome `navigator` in the JS render tier** — `ENV_BOOTSTRAP` now installs a coherent Chrome 149 (macOS) `navigator` (UA, `platform`, `vendor`, `webdriver: false`, `hardwareConcurrency`/`deviceMemory`, a Chrome PDF plugin set) plus a `window.chrome`, and masks the JS polyfills' `Function.prototype.toString` so built-ins (`fetch`, `setTimeout`, …) report `[native code]`. Replaces the old `turbo-surf`/`turbo-test` tell page JS used to see. No-Chromium env emulation: satisfies passive/consistency anti-bot probes, not active canvas/WebGL/audio fingerprinting or PoW challenges.
- **Fingerprint seed pool** (`turbo-surf-core::fingerprint`) — ~4000 internally coherent real-Chrome identities, selected deterministically by a client key (stable per client, spread across the fleet). Opt-in via `FetchOptions.profile`; the default reproduces the prior fixed Chrome-149/macOS wire behaviour. Raises the passive/consistency anti-bot bar; does not defeat active fingerprinting/PoW.
- **Challenge-solver integration** (`turbo-surf-core::challenge`) — detect a JS-challenge/PoW wall (Akamai/DataDome/Kasada/Cloudflare) and hand it to a server-side solver (`ScrapflySolver`/`HyperSolver`) that returns tokens/cookies to replay. Configured via env / `.env` (`HYPER_API_KEY`, `SCRAPFLY_API_KEY`, `TURBO_SURF_SOLVER`, `TURBO_SURF_PROXY`); inert until a real key is set. See `.env.example`. Wired into the MCP session (per-host profile + auto solve/replay on a detected wall) and `TurboNavigator` (the crawl seam). Also a self-owned `BrowserSolver` (opt-in `TURBO_SURF_SOLVER=browser` + `TURBO_SURF_BROWSER_CMD`) that shells to a hardened-headless sidecar over a JSON contract — Chromium stays out of the engine; reference sidecar in `harness/browser-solver/`. MCP `stealth_status` tool reports the active profile + wired solver.
- **In-house Akamai solver** (`turbo-surf-core::akamai`, `TURBO_SURF_SOLVER=akamai`, no key) — the first hand-written `ChallengeSolver`: `generate_sensor` builds a deterministic Akamai-shaped `sensor_data` payload, `AkamaiSolver` POSTs it to the sensor endpoint and parses the cleared `_abck`. Structure + POST/parse flow are tested + green; the dynamic field encoding a live edge validates still needs keying off a real `_abck` script (use the `probe` mode).
- **In-house Cloudflare solver** (`turbo-surf-core::cloudflare`, `TURBO_SURF_SOLVER=cloudflare`, no key) — parse the managed-challenge interstitial (`window._cf_chl_opt`), solve its (JS-compute) PoW, POST to the challenge-platform endpoint, harvest `cf_clearance`. Structure + flow green; real per-version PoW math keyed off a live challenge. Turnstile-interactive stays on the browser sidecar.
- **`probe-script` example** (`cargo run -p turbo-surf-render --example probe-script -- script.js`) — run the `probe` instrumentation over a real captured anti-bot script and print what it touched + the shim gaps.
Run against a real Akamai sensor it surfaced two missing `navigator` props (`connection`, `userAgentData`), now added to the render-tier navigator (coherent with the UA).
- **Runtime-controllable render fingerprint** — every render-tier `navigator` field (UA, platform, vendor, languages, hardwareConcurrency, deviceMemory, chromeMajor, connection, userAgentData, screen, devicePixelRatio) now has a Chrome 149 default and is overridable via `turbo_surf_render::set_fingerprint(json)` / the MCP `set_fingerprint` tool. `stealth_status` reports the active overrides.
- **Fingerprint debug/probe mode** (`turbo-surf-render::probe_globals`, MCP `probe` tool) — run a page's JS with `navigator`/`screen`/`window.chrome`/canvas wrapped in logging proxies and report every property it touched + which reads returned `undefined` (the shim to-do list). Recon for what an anti-bot check probes and what's left to emulate.

## [0.2.5]

A **pooled-render latency** fix on the JS-crawl fast path.

### Fixed

- **Watchdog join latency in `render_page_pooled`** — the per-page execution-budget watchdog polled completion on a 2 ms sleep loop, so `watchdog.join()` after a healthy render blocked until the watchdog woke from its current sleep, adding up to 2 ms of latency to *every* pooled render. The watchdog now `park_timeout`s on the budget deadline and is `unpark`ed the instant the render completes, so a healthy render's join returns in µs (an elapsed-guard loop survives spurious wakeups; the deadline still terminates a runaway script). Measured on `quotes.toscrape.com/js` (warm pool): **renderPooled 2.6 ms → 1.3 ms (−50%)**, output byte-identical. The politeness/network-bound crawl wall is unchanged; CPU/parallel render throughput ~doubles.

## [0.2.4]

A **Linux SIGBUS** fix in the Playwright-shim test harness, plus a new **Python (PyPI) binding**.
### Fixed

- **SIGBUS on Linux running the shim suite** (#6) — root cause was a bug in the shim's fake `@playwright/test` harness: `test.describe(...)` was registered as a node:test *test* instead of a *suite*, so the nested `test(...)` calls in a describe body fired on the global runner while the parent test was still running. node:test cancelled them ("test did not finish before its parent and was cancelled"), and the dangling async test — still holding a live-session V8 isolate — was torn down at process exit, which faulted with SIGBUS on Linux (macOS tolerated it). Latent until v0.2.3 wired a real `npm test`, so the multi-file shim run never executed on CI before. `makeDescribe` now registers a real node:test suite, so nested tests are awaited and torn down cleanly.
- **V8 platform init hardening** (defense-in-depth) — `ensure_platform()` initializes the V8 platform once on a dedicated, parked keeper thread (deno_core otherwise inits it lazily on whichever thread builds the first runtime; a transient one that then exits orphans the platform). Called before any worker isolate is created from `evaluate`, `render`, `render_pooled`, `hydrate`, and `live_open`.

### Added

- **Python binding (`turbo-surf` on PyPI)** — a PyO3 abi3 wheel (CPython 3.8+) exposing the stateless parse → view/extract → JS-render surface (`markdown`, `text`, `links`, `query`, `extract`, `evaluate`, `render`, `transform`, …), mirroring the Node N-API functions. New crate `rust/crates/turbo-surf-py`; `release-py.yml` builds + publishes wheels on a `pyv*` tag (gated on a `PYPI_TOKEN` secret). A real `test` npm script + the stale shim-assertion fixes from v0.2.3's CI work are included.

## [0.2.3]

A **JS-render speed** pass on the crawl path: the render tier built a fresh V8 isolate per page (boot + the ~90 KB env bootstrap + parse dominate), so a JS-mode crawl paid the full isolate boot on every page.
A pooled fast path reuses one isolate across pages — boot is paid once per worker thread — with a cross-page global scrub so a reused isolate still renders like a fresh navigation. **11.5 ms → 3.2 ms per page** on `quotes.toscrape.com/js` (3.6×), output byte-identical to the fresh render.

### Added

- **Pooled render fast path** — `render_page_pooled` (Rust) / `renderPooled` (napi) reuses a thread-local V8 isolate across pages, on a persistent render worker (one long-lived thread + one reused tokio runtime). Per-page session repoint (base/cookies/UA) + a global scrub (`SCRUB_GLOBALS`) restore fresh-navigation semantics; a budget-terminated/errored runtime is dropped instead of repooled. The competitive JS adapter drives it; `render` (fully isolated, fresh-per-page) is unchanged for correctness-sensitive callers.
- **`harness/hotpath/render-bench.mjs`** — reusable offline profiler for `native.render` / `renderPooled` (faithful script extraction, cached sample, A/B + parity check).

### Notes

- Cross-page isolation is intentionally relaxed for crawl speed (matching the existing `EVAL_RT` stance): the scrub reverts page-ADDED globals, not builtins mutated in place.
- A V8 code cache for the bootstrap + page bundle was tried and reverted — with a fresh isolate per page, `ConsumeCodeCache` costs more than a re-parse. Isolate reuse is the real lever.

## [0.2.2]

The headless **Playwright-shim parity** push: the payroll-app Playwright e2e suite now runs through the browserless shim (over the napi addon, **no Chromium**), driving a real authenticated Next.js App Router SPA. Side-by-siding every failure against real Chromium (reseeded per suite) drove the engine to parity — the suite's remaining reds all reproduce in Chromium too (app/backend/test data, not the engine). See `HEADLESS-HYDRATION.md` for the full record.

### Added

- **ES module support in the render tier** — `