# @aipexstudio/dom-snapshot

A lightweight library for capturing DOM snapshots without relying on the Chrome DevTools Protocol (CDP) Accessibility Tree (AXTree). It provides a pure JavaScript/TypeScript solution for creating structured page snapshots that can be used for web automation, testing, and AI-powered browser agents.

## Why Not CDP AXTree?

Traditional approaches to capturing page structure often rely on CDP's Accessibility Tree, which has several limitations:

- **Browser dependency**: Requires Chrome/Chromium with DevTools Protocol
- **Performance overhead**: CDP communication adds latency
- **Complex setup**: Needs browser debugging port configuration
- **Limited portability**: Doesn't work in all browser contexts

This library takes a different approach: it traverses the DOM directly and builds a semantic snapshot that mimics the structure of an accessibility tree, but works in any browser environment with just JavaScript.

## Features

- **Pure DOM-based**: No CDP or browser extensions required
- **Accessibility-aware**: Captures semantic roles, names, and states following ARIA patterns
- **Interactive element focus**: Prioritizes buttons, links, inputs, and other actionable elements
- **Hidden element filtering**: Automatically skips elements that are `aria-hidden`, `display:none`, `visibility:hidden`, or `inert`
- **Stable node IDs**: Assigns persistent `data-aipex-nodeid` attributes for reliable element targeting
- **Text content extraction**: Captures static text nodes for full page context
- **Configurable options**: Control text length limits, hidden element inclusion, and text node capture
- **Search functionality**: Built-in glob pattern search across snapshot text

## Installation

```bash
npm install @aipexstudio/dom-snapshot
# or
pnpm add @aipexstudio/dom-snapshot
```

## Usage

### Basic Snapshot Collection

```typescript
import { collectDomSnapshot, collectDomSnapshotInPage } from '@aipexstudio/dom-snapshot';

// Collect a snapshot from the current page
const pageSnapshot = collectDomSnapshotInPage();

// Or specify a custom document and options
const snapshot = collectDomSnapshot(document, {
  maxTextLength: 160,     // Max characters for element text (default: 160, does not affect StaticText)
  includeHidden: false,   // Include hidden elements (default: false)
  captureTextNodes: true, // Capture StaticText nodes (default: true)
});

console.log(snapshot.totalNodes);   // Total nodes captured
console.log(snapshot.root);         // Root node of the tree
console.log(snapshot.idToNode);     // Flat map of id -> node
console.log(snapshot.metadata.url); // Page URL
```

### Converting to Text Format

```typescript
import { collectDomSnapshot, buildTextSnapshot, formatSnapshot } from '@aipexstudio/dom-snapshot';

// Collect the raw snapshot
const serialized = collectDomSnapshot(document);

// Convert to TextSnapshot format
const textSnapshot = buildTextSnapshot(serialized);

// Format as a readable text representation
const formatted = formatSnapshot(textSnapshot);
console.log(formatted);
```

Output example:

```
→uid=dom_abc123 RootWebArea "My Page"
  uid=dom_def456 button "Submit"