crawlex
Copyright (c) 2026 Filipe Forattini

This product includes software developed by third parties as listed below.

================================================================================
Firecrawl - https://github.com/firecrawl/firecrawl
MIT License - Copyright (c) Sideguide Technologies Inc.

Portions of `src/extract/html_clean.rs`, `src/extract/link_filter.rs`, and
`src/extract/sitemap.rs` are ports of Rust-native code from
`apps/api/native/src/{html.rs, crawler.rs}` in Firecrawl. The original logic
(including the EXCLUDE_NON_MAIN_TAGS selector list, post-markdown cleanup
rules, link filter heuristics, and sitemap XML parser) retains the upstream
MIT license.

================================================================================
FingerprintJS - https://github.com/fingerprintjs/fingerprintjs
MIT License - Copyright (c) FingerprintJS Inc.

Test fixtures in `tests/fpjs_compliance.rs` (font list, WebGL parameter keys,
math golden vectors, DOM blocker selectors) are derived from the open-source
FingerprintJS v3 source code at `src/sources/*.ts`. The original files retain
the upstream MIT license; we use them as a compliance target for our stealth
shim and do not redistribute them as-is.

================================================================================
BoringSSL (via `boring` crate) - https://boringssl.googlesource.com/boringssl
ISC / OpenSSL-style license

Linked through the `boring` and `boring-sys` crates.

================================================================================
curl-impersonate - https://github.com/lwthiker/curl-impersonate
MIT License - Copyright (c) 2022 lwthiker

Per-browser-version `tls_client_hello` YAML signatures from upstream v0.6.1-3
(`tests/signatures/{chrome,edge,firefox,safari}.yaml`) are vendored under
`src/impersonate/catalog/vendored/` and read by `build.rs` at compile time to
populate the static `TlsFingerprint` catalog.
The upstream MIT license text is preserved verbatim at
`src/impersonate/catalog/vendored/LICENSE-curl-impersonate`.

================================================================================
vercel-labs/agent-browser - https://github.com/vercel-labs/agent-browser
Apache License 2.0

The design of `src/policy/action_policy.rs` (per-verb allow/deny/confirm with
JSON load + default fallback) is inspired by `cli/src/native/policy.rs` in
agent-browser. It is not a line-for-line port; the Rust types, serde shape, and
tests are original, written to plug into crawlex's own `PolicyEngine` and
NDJSON event envelope. The conceptual debt is acknowledged here.

================================================================================
chromiumoxide - https://github.com/mattsse/chromiumoxide
MIT License - Copyright (c) 2020 Matthias Seitz
Apache License 2.0

The Chrome DevTools Protocol driver under `src/render/chrome/`,
`src/render/chrome_protocol/`, `src/render/chrome_fetcher/`, and
`src/render/chrome_wire.rs` is derived from chromiumoxide (0.9.x + master
post-0.9.1 commits, upstream as of rev afcc3a4313f2). The upstream crate was
incorporated in-tree and split up into first-party modules rather than
consumed as an external dependency, so we can patch CDP-schema drift (Chrome
149 removed `ClientSecurityState.privateNetworkRequestPolicy`, renamed
`Page.lifecycleEvent[init]` to `commit`, etc.) and apply stealth patches
(Runtime.Enable absence, isolated-world context resolution) on our own cadence
without maintaining a separate fork. The original dual-licensed terms are
preserved verbatim in `src/render/LICENSES/{MIT,APACHE,NOTICE}`. The code in
those directories has been substantially modified; see
`git log src/render/chrome` for the patch history.

================================================================================
Additional third-party notices for code embedded under `src/render/` are kept
with each module in `src/render/LICENSES/`.