---
name: nano-pdf
cluster: community
description: "PDF processing: extraction, text mining, form filling, manipulation, OCR integration"
tags: ["pdf","document","extraction","ocr"]
dependencies: []
composes: []
similar_to: []
called_by: []
authorization_required: false
scope: general
model_hint: claude-sonnet
embedding_hint: "pdf extract text mine form fill manipulate ocr"
---

# nano-pdf

## Purpose

This skill provides tools for PDF processing, including text extraction, mining, form filling, manipulation, and OCR integration, to handle document workflows efficiently.

## When to Use

Use this skill for tasks involving PDF data extraction (e.g., from scanned documents), text analysis in reports, automating form submissions, merging/splitting files, or applying OCR to non-text PDFs. Apply it in data pipelines, document automation scripts, or when integrating with OCR services for unstructured data.

## Key Capabilities

- Text extraction: Pulls plain text or structured data from PDFs, supporting encrypted files with passwords; uses OCR via Tesseract integration for image-based PDFs.
- Text mining: Analyzes extracted text for keywords, sentiment, or patterns; e.g., counts occurrences of phrases in a document.
- Form filling: Populates interactive PDF forms with JSON data; supports flattening forms to static PDFs.
- Manipulation: Merges, splits, rotates, or watermarks PDFs; handles up to 500-page documents efficiently.
- OCR integration: Converts scanned PDFs to searchable text using external APIs; requires Tesseract or similar engine configuration.

## Usage Patterns

Invoke via CLI for quick scripts or API for server-side integration. For batch processing, chain commands in a shell script; for web apps, use API calls in loops. Always specify input/output paths explicitly. Pattern: Extract text first, then mine or manipulate as needed. For OCR-heavy tasks, preprocess images before PDF operations.

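
The extract-then-mine pattern for batch work can be sketched in Python as follows. This is a minimal sketch that assumes the `nano-pdf` CLI flags listed in this document; the helper name `build_batch_commands` and the output file naming are hypothetical. It only builds the argument lists, so each command can be inspected before it is handed to something like `subprocess.run`.

```python
from pathlib import Path


def build_batch_commands(pdf_paths, keywords, out_dir="out"):
    """Build extract-then-mine command pairs for each PDF.

    Assumes the `nano-pdf` flags documented in this skill; the output
    file naming is a hypothetical convention for illustration.
    """
    out = Path(out_dir)
    commands = []
    for pdf in map(Path, pdf_paths):
        text_file = out / f"{pdf.stem}.txt"
        result_file = out / f"{pdf.stem}.json"
        # Step 1: extract text, with OCR enabled for non-selectable text.
        commands.append([
            "nano-pdf", "extract", "--file", str(pdf),
            "--output", str(text_file), "--ocr", "true",
        ])
        # Step 2: mine the extracted text for the given keywords.
        commands.append([
            "nano-pdf", "mine", "--input", str(text_file),
            "--keywords", ",".join(keywords),
            "--output", str(result_file),
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_batch_commands(["a.pdf", "b.pdf"], ["AI", "robot"]):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes the explicit input and output paths easy to review, and leaves retry or logging policy to the caller.
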
## Common Commands/API

CLI commands use the `nano-pdf` binary; API endpoints are under `https://api.opencclaw.com/nano-pdf/`. Authentication requires the `$NANO_PDF_API_KEY` environment variable.

- Extract text: `nano-pdf extract --file input.pdf --output text.txt --ocr true` (adds OCR if text is not selectable).
- Mine text: `nano-pdf mine --input text.txt --keywords "AI,robot" --output results.json` (outputs keyword frequencies).
- Fill form: `nano-pdf fill --template form.pdf --data '{"field1": "value"}' --output filled.pdf`.
- Manipulate PDF: `nano-pdf merge --files file1.pdf file2.pdf --output combined.pdf`.
- API endpoint for extraction: `POST /extract` with body `{"file": "base64encoded_content", "ocr": true}` and header `Authorization: Bearer $NANO_PDF_API_KEY`.
- Code snippet (Python):
  ```
  import os

  import requests

  # Send the PDF content to the extraction endpoint, authenticating with the API key.
  response = requests.post(
      'https://api.opencclaw.com/nano-pdf/extract',
      headers={'Authorization': f'Bearer {os.environ["NANO_PDF_API_KEY"]}'},
      json={'file': 'base64data'},  # placeholder for the base64-encoded PDF content
  )
  print(response.json()['text'])
  ```
- Config format: JSON for API bodies, e.g., `{"file": "path", "options": {"ocr_engine": "tesseract", "language": "en"}}`; the CLI takes flag-based configs such as `--config config.json`.

## Integration Notes

Integrate by setting `$NANO_PDF_API_KEY` for authenticated requests; for local use, install via `pip install nano-pdf` and import it as a module. Combine with other tools: pipe CLI output to NLP libraries for mining, or call the API from Node.js via HTTP requests. For OCR, ensure Tesseract is installed and available on your `PATH`. Test integrations in a sandbox to verify API rate limits (e.g., 100 requests/min).

## Error Handling

Check for common errors like file not found (exit code 404), invalid API keys (401), or OCR failures (e.g., no Tesseract installed).

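
The common failure causes named above can be checked before any PDF work starts. The following is a minimal sketch and not part of nano-pdf itself: the function name `preflight_problems` is hypothetical, and it assumes the Tesseract binary is named `tesseract` on `PATH`.

```python
import os
import shutil


def preflight_problems(pdf_path, env=None, which=shutil.which):
    """Return a list of human-readable problems found before processing.

    Hypothetical helper: checks the input file, the API key variable,
    and whether a `tesseract` binary is on PATH (assumed binary name).
    """
    env = os.environ if env is None else env
    problems = []
    if not os.path.isfile(pdf_path):
        problems.append(f"file not found: {pdf_path}")
    if not env.get("NANO_PDF_API_KEY"):
        problems.append("NANO_PDF_API_KEY is not set")
    if which("tesseract") is None:
        problems.append("tesseract not found on PATH (needed for OCR)")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems("input.pdf"):
        print("Preflight:", problem)
```

The `env` and `which` parameters exist only so the checks can be exercised in tests without touching the real environment.
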
Use try-except in code:

```
import nano_pdf  # module installed via `pip install nano-pdf`, as named in this document

try:
    result = nano_pdf.extract('input.pdf')
except FileNotFoundError:
    # The input path does not exist on disk.
    print("Error: File does not exist.")
except Exception as e:
    # Any other failure, including rejected API keys.
    print(f"API Error: {e} - Check $NANO_PDF_API_KEY.")
```

For the CLI, parse stderr output; retry transient errors (e.g., network issues) with exponential backoff. Always validate inputs, for example by ensuring PDFs are not corrupted before processing.

## Example 1: Extract and Mine Text from a PDF

To extract text from a scanned invoice PDF and mine it for product names:

1. Run: `nano-pdf extract --file invoice.pdf --output invoice_text.txt --ocr true`
2. Then: `nano-pdf mine --input invoice_text.txt --keywords "product" --output analysis.json`

This produces a JSON file with keyword occurrences for further processing.

## Example 2: Fill and Manipulate a Form PDF

To fill a job application form and merge it with a cover letter:

1. Prepare the data as JSON and save it as `application_data.json`: `{"name": "John Doe", "position": "Engineer"}`
2. Execute: `nano-pdf fill --template application.pdf --data application_data.json --output filled_app.pdf`
3. Merge: `nano-pdf merge --files filled_app.pdf cover_letter.pdf --output final_packet.pdf`

The output is a single PDF ready for submission.

## Graph Relationships

- Related to: "ocr-tool" (for enhanced OCR capabilities), "document-parser" (for broader file type support), "text-analyzer" (for advanced mining integrations).
- Clusters: Connected via "community" cluster to skills like "data-extraction" and "automation-utils".