# @aipexstudio/dom-snapshot
A lightweight library for capturing DOM snapshots without relying on the Chrome DevTools Protocol (CDP) accessibility tree (AXTree). It provides a pure JavaScript/TypeScript solution for creating structured page snapshots that can be used for web automation, testing, and AI-powered browser agents.
## Why Not CDP AXTree?
Traditional approaches to capturing page structure often rely on CDP's Accessibility Tree, which has several limitations:
- **Browser dependency**: Requires Chrome/Chromium with DevTools Protocol
- **Performance overhead**: CDP communication adds latency
- **Complex setup**: Needs browser debugging port configuration
- **Limited portability**: Doesn't work in all browser contexts
This library takes a different approach: it traverses the DOM directly and builds a semantic snapshot that mimics the structure of an accessibility tree, yet works in any browser environment with just JavaScript.
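To make the idea of a DOM-based semantic snapshot concrete, here is a minimal sketch of such a traversal. It is not the library's implementation: `ElementLike`, `SketchNode`, `TAG_ROLES`, and `sketchSnapshot` are hypothetical names, the tag-to-role table is a tiny illustrative subset, and only `aria-hidden` is checked when skipping hidden subtrees.

```typescript
// Hypothetical minimal element shape; a real DOM Element satisfies it structurally.
interface ElementLike {
  tagName: string;
  textContent: string | null;
  children: ArrayLike<ElementLike>;
  getAttribute(name: string): string | null;
}

// Hypothetical output node: a role, an accessible-name approximation, and children.
interface SketchNode {
  role: string;
  name: string;
  children: SketchNode[];
}

// Illustrative tag-to-role table (an assumption, not the library's actual mapping).
const TAG_ROLES = new Map<string, string>([
  ["a", "link"],
  ["button", "button"],
  ["input", "textbox"],
  ["h1", "heading"],
]);

function sketchSnapshot(el: ElementLike): SketchNode | null {
  // Skip subtrees that are explicitly hidden from assistive technology.
  if (el.getAttribute("aria-hidden") === "true") return null;

  // An explicit role attribute wins over the tag-based default.
  const tag = el.tagName.toLowerCase();
  const role = el.getAttribute("role") ?? TAG_ROLES.get(tag) ?? "generic";

  // Rough name approximation: aria-label first, then the element's text.
  const name = (el.getAttribute("aria-label") ?? el.textContent ?? "").trim();

  const children: SketchNode[] = [];
  for (const child of Array.from(el.children)) {
    const node = sketchSnapshot(child);
    if (node !== null) children.push(node);
  }
  return { role, name, children };
}
```

In a browser, `sketchSnapshot(document.body)` would return a nested role/name tree. A real implementation also has to handle computed styles, `inert`, text nodes, and element states, which this sketch leaves out.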
## Features
- **Pure DOM-based**: No CDP or browser extensions required
- **Accessibility-aware**: Captures semantic roles, names, and states following ARIA patterns
- **Interactive element focus**: Prioritizes buttons, links, inputs, and other actionable elements
- **Hidden element filtering**: Automatically skips `aria-hidden`, `display:none`, `visibility:hidden`, and `inert` elements
- **Stable node IDs**: Assigns persistent `data-aipex-nodeid` attributes for reliable element targeting
- **Text content extraction**: Captures static text nodes for full page context
- **Configurable options**: Control text length limits, hidden element inclusion, and text node capture
- **Search functionality**: Built-in glob pattern search across snapshot text
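
As an illustration of what glob search over snapshot text can look like, the following sketch converts a simple glob into a regular expression and filters lines with it. The helper names `globToRegExp` and `searchLines` are hypothetical and are not the library's API; the sketch assumes only `*` and `?` wildcards, case-insensitive matching, and a match anywhere in the line.

```typescript
// Hypothetical helper: turn a simple glob (`*` and `?` only) into a RegExp.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters other than the glob wildcards.
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  // `*` matches any run of characters, `?` matches exactly one character.
  const pattern = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  // Unanchored and case-insensitive: the glob may match anywhere in a line.
  return new RegExp(pattern, "i");
}

// Hypothetical helper: return the lines of a formatted snapshot that match the glob.
function searchLines(snapshotText: string, glob: string): string[] {
  const re = globToRegExp(glob);
  return snapshotText.split("\n").filter((line) => re.test(line));
}
```

For example, calling `searchLines(formatted, "button*save")` on formatted snapshot text would keep only the lines where "button" is later followed by "save".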
## Installation
```bash
npm install @aipexstudio/dom-snapshot
# or
pnpm add @aipexstudio/dom-snapshot
```
## Usage
### Basic Snapshot Collection
```typescript
import { collectDomSnapshot, collectDomSnapshotInPage } from '@aipexstudio/dom-snapshot';
// Collect snapshot from current page
const snapshot = collectDomSnapshotInPage();
// Or specify a custom document
const customSnapshot = collectDomSnapshot(document, {
  maxTextLength: 160,     // Max characters for element text (default: 160; does not affect StaticText)
  includeHidden: false,   // Include hidden elements (default: false)
  captureTextNodes: true, // Capture StaticText nodes (default: true)
});
console.log(snapshot.totalNodes); // Total nodes captured
console.log(snapshot.root); // Root node of the tree
console.log(snapshot.idToNode); // Flat map of id -> node
console.log(snapshot.metadata.url); // Page URL
```
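Because elements are tagged with a `data-aipex-nodeid` attribute, a node id can be turned back into a live element with a standard attribute selector. The sketch below assumes that the ids in the snapshot are the same strings written to that attribute, which this README does not state explicitly. `selectorForNodeId` is a hypothetical helper, not part of the package.

```typescript
// Hypothetical helper: build a CSS attribute selector for a snapshot node id.
function selectorForNodeId(nodeId: string): string {
  // Escape backslashes and double quotes so the id is safe inside the quoted value.
  // Other special characters (such as newlines) are not handled in this sketch.
  const escaped = nodeId.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `[data-aipex-nodeid="${escaped}"]`;
}

// In a browser context the selector can be passed to querySelector, e.g.:
//   const el = document.querySelector(selectorForNodeId("dom_abc123"));
```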
### Converting to Text Format
```typescript
import { collectDomSnapshot, buildTextSnapshot, formatSnapshot } from '@aipexstudio/dom-snapshot';
// Collect raw snapshot
const serialized = collectDomSnapshot(document);
// Convert to TextSnapshot format
const textSnapshot = buildTextSnapshot(serialized);
// Format as readable text representation
const formatted = formatSnapshot(textSnapshot);
console.log(formatted);
```
Output example:
```
→uid=dom_abc123 RootWebArea "My Page"