---
name: summarize
description: Summarize any content (YouTube video, article, PDF, EPUB, podcast) into a rich Obsidian note with section-by-section breakdowns and optional wikilinks. Handles messy PDFs, scanned documents, and complex layouts. Use when the user provides a URL, file, or content to summarize.
user_invocable: true
---

# Summarize

Universal content summarizer. Takes any input -- YouTube video, web article, PDF, EPUB book, podcast, lecture -- and produces a detailed, well-structured Obsidian summary note.

Forked from [reysu/ai-life-skills](https://github.com/reysu/ai-life-skills) with improvements:

- Better PDF extraction (marker > pdftotext fallback chain)
- Simpler vault structure
- Reduced reference note sprawl (only key people and major concepts)
- Fast-path mode for quick overviews
- Handles scanned/messy PDFs via OCR fallback

## Requirements

**Vault structure** -- the skill uses a simple flat layout. Override folder names in Configuration below.

| Folder | Purpose |
|---|---|
| `Summaries/` | Where summary notes land |
| `References/` | Concept / company / product notes (only major ones) |
| `People/` | Person notes |
| `Transcripts/` | Raw transcripts for audio/video content |
| `_Attachments/` | Where you drop source files (ebooks, PDFs) before summarizing |
| `_Templates/` | Note templates |

**CLI tools** -- install before first use, or let Step 0 walk you through it:

| Tool | Purpose | Install | Required? |
|---|---|---|---|
| `yt-dlp` | YouTube/podcast download + metadata + subs | `brew install yt-dlp` or `pip install yt-dlp` | For YouTube/podcasts |
| `defuddle` | Web article extraction | `npm install -g defuddle` | For web articles |
| `marker_single` | PDF text extraction (handles complex layouts, scans, OCR) | `pip install marker-pdf` | For PDFs (primary) |
| `pdftotext` | PDF text extraction (simple/clean PDFs) | `brew install poppler` | For PDFs (fallback) |
| `pandoc` | EPUB / DOCX to markdown | `brew install pandoc` | For ebooks |
| `mlx_whisper` | Local audio transcription fallback | `pip install mlx-whisper` | Optional |

## Configuration

Override via environment variables or edit the defaults here:

```
VAULT_ROOT      = $VAULT_ROOT   # auto-detected if not set (walks up looking for .obsidian/)
SUMMARIES_DIR   = Summaries
REFERENCES_DIR  = References
PEOPLE_DIR      = People
TRANSCRIPTS_DIR = Transcripts
TEMPLATES_DIR   = _Templates
ATTACHMENTS_DIR = _Attachments
```

All paths are relative to `$VAULT_ROOT`.
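A minimal sketch of how the bootstrap shell might resolve these overrides -- the `${VAR:-default}` pattern is an assumption, not part of the skill's contract; the names match the table above:

```bash
# Apply defaults only where the user has not exported an override.
SUMMARIES_DIR="${SUMMARIES_DIR:-Summaries}"
REFERENCES_DIR="${REFERENCES_DIR:-References}"
PEOPLE_DIR="${PEOPLE_DIR:-People}"
TRANSCRIPTS_DIR="${TRANSCRIPTS_DIR:-Transcripts}"
TEMPLATES_DIR="${TEMPLATES_DIR:-_Templates}"
ATTACHMENTS_DIR="${ATTACHMENTS_DIR:-_Attachments}"
# VAULT_ROOT is deliberately left unset here so Step 0a can auto-detect it.
```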
## Trigger

When the user provides content to summarize: a URL (YouTube, article, blog), a PDF/file path, pasted text, or a reference to content already in the vault.

## Inputs

- **Source**: URL, file path, or pasted text
- **Mode** (optional): defaults to `full`
  - `full` -- chapter-by-chapter or section-by-section detailed breakdown (default)
  - `quick` -- 1-2 page overview, skip reference note creation
  - `deep` -- extra detail, include more quotes, create all reference notes
- **Audience** (optional): defaults to "general reader." User may specify (e.g. "technical", "eli5")

## Step 0: Bootstrap check (first run only)

Verify the environment is ready. **Skip any check that already passes.** Do not re-run on subsequent invocations if setup succeeded.

### 0a. Resolve the vault root

```bash
vault=""
if [ -n "$VAULT_ROOT" ]; then
  vault="$VAULT_ROOT"
else
  dir="$PWD"
  while [ "$dir" != "/" ]; do
    if [ -d "$dir/.obsidian" ]; then vault="$dir"; break; fi
    dir="$(dirname "$dir")"
  done
fi
echo "Vault: ${vault:-NOT FOUND}"
```

If no vault is found, ask: **"What's the absolute path to your Obsidian vault?"**

### 0b. Check required folders

```bash
for d in "$SUMMARIES_DIR" "$REFERENCES_DIR" "$PEOPLE_DIR" "$TRANSCRIPTS_DIR" "$TEMPLATES_DIR" "$ATTACHMENTS_DIR"; do
  [ -d "$VAULT_ROOT/$d" ] || echo "MISSING: $d"
done
```

For each missing folder, ask before creating: **"Create `<folder>` in your vault? [y/N]"**

### 0c. Check required CLI tools

```bash
for tool in yt-dlp defuddle pdftotext pandoc; do
  command -v "$tool" >/dev/null 2>&1 || echo "MISSING: $tool"
done

# marker is optional but preferred -- check separately
command -v marker_single >/dev/null 2>&1 || echo "OPTIONAL (recommended): marker_single -- pip install marker-pdf"
```

Ask before installing anything. If the user declines, note which content types will be affected.

### 0d. Install person template if missing

If `$VAULT_ROOT/$TEMPLATES_DIR/new person template.md` doesn't exist, copy the minimal template from this skill's `templates/` folder into the user's `_Templates/` directory.

## Step 1: Detect content type and extract text

### YouTube video

```bash
# Get metadata
yt-dlp --cookies-from-browser chrome \
  --print "%(id)s|%(title)s|%(duration)s|%(upload_date)s|%(view_count)s|%(channel)s|%(channel_id)s" \
  --no-download "<url>"

# Try auto-subtitles first (fastest)
yt-dlp --cookies-from-browser chrome \
  --write-auto-sub --sub-lang en --sub-format json3 \
  --skip-download -o "/tmp/summarize/%(id)s" "<url>"
```

If auto-subs exist, extract text from the JSON3 file. If they don't, or the quality is poor, download the audio and transcribe it (mlx_whisper, or ElevenLabs if `ELEVENLABS_API_KEY` is set).
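One way to flatten the JSON3 subtitles into plain text -- a minimal sketch assuming the file landed at the `-o` path above (`<id>` is the video ID; the `events[].segs[].utf8` layout is YouTube's json3 caption format as written by yt-dlp):

```bash
python3 - <<'EOF'
import json

# Path produced by the yt-dlp command above; <id> is the video ID placeholder.
path = '/tmp/summarize/<id>.en.json3'
with open(path) as f:
    data = json.load(f)

lines = []
for event in data.get('events', []):
    # Each event is one caption; its text is split across segments.
    text = ''.join(seg.get('utf8', '') for seg in event.get('segs', []))
    if text.strip():
        lines.append(text.strip())

print(' '.join(lines))
EOF
```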
### Web article / blog post

```bash
defuddle parse "<url>" --md -o /tmp/summarize/article.md
defuddle parse "<url>" -p title
defuddle parse "<url>" -p domain
```

### PDF (improved extraction pipeline)

**Try marker first** (handles complex layouts, multi-column, scans, OCR):

```bash
if command -v marker_single >/dev/null 2>&1; then
  marker_single "<file.pdf>" --output_dir /tmp/summarize/
  # marker outputs markdown with preserved structure
else
  # Fallback to pdftotext
  pdftotext "<file.pdf>" /tmp/summarize/paper.txt
fi
```

**Quality check after extraction:**

```bash
# Check if extraction produced reasonable output
wc -w /tmp/summarize/paper.txt   # or the marker output

# If word count is suspiciously low relative to page count, warn the user:
# "The PDF extraction looks sparse -- this might be a scanned document.
#  marker-pdf handles these better. Want me to install it? (pip install marker-pdf)"
```

**For PDFs that are actually books** (>100 pages, or the user says it's a book):

- Use marker for extraction, then apply the book/chapter splitting strategy from the EPUB section below

### EPUB (books)

```bash
# Extract full text as markdown (preserves chapter structure)
pandoc "<file.epub>" -t markdown --wrap=none -o /tmp/summarize/book.md

# Extract TOC for chapter boundaries:
pandoc "<file.epub>" -t json | python3 -c "
import json, sys
doc = json.load(sys.stdin)
for block in doc['blocks']:
    if block['t'] == 'Header':
        level = block['c'][0]
        text = ''.join(
            item['c'] if item['t'] == 'Str' else ' ' if item['t'] == 'Space' else ''
            for item in block['c'][2]
        )
        print(f'L{level}: {text}')
"
```

**Chapter splitting strategy for books** (a splitting sketch follows below):

1. Extract full text with `pandoc` (or `marker` for PDF books)
2. Identify chapter boundaries from headers
3. Split into one chunk per chapter
4. Dispatch parallel Opus subagents -- one per chapter
5. A typical book (60-100k words, 15-30 chapters) produces chapters of ~3-5k words each

**For very long books (>30 chapters):** batch chapters into groups of ~5 per subagent.
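A minimal splitting sketch, assuming the extracted book sits at `/tmp/summarize/book.md` and chapters begin at level-1 or level-2 headings (the heading heuristic and output file names are illustrative):

```bash
python3 - <<'EOF'
import re

with open('/tmp/summarize/book.md') as f:
    text = f.read()

# Split *before* each level-1/level-2 heading (zero-width lookahead),
# so every chunk keeps its own chapter heading.
chunks = [c for c in re.split(r'(?m)^(?=#{1,2} )', text) if c.strip()]

for i, chunk in enumerate(chunks, 1):
    with open(f'/tmp/summarize/chapter_{i:02d}.md', 'w') as f:
        f.write(chunk)

print(f'{len(chunks)} chapter chunks written')
EOF
```

Each chunk file then goes to one subagent (or, for very long books, is batched in groups of ~5 as noted above).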
### Other files

- `.docx`: `pandoc "<file.docx>" -t markdown --wrap=none -o /tmp/summarize/doc.md`
- Plain text: read directly
- Pasted text / vault note: read directly

## Step 1b: Save transcript (audio/video content only)

For spoken-word content (YouTube, podcasts, lectures), save the transcript as a permanent note.

**Location:** `$TRANSCRIPTS_DIR/<Title> Transcript.md`

```markdown
---
date: YYYY-MM-DD
duration: <seconds>
source: "<URL>"
summary_note: "[[<Summary Note Title>]]"
unread: true
---

[Full timestamped transcript text]
```

**Do NOT create transcripts for:** articles, blog posts, PDFs, books, pasted text.

## Step 2: Determine output structure

| Content type | Location | Tag | Key frontmatter |
|---|---|---|---|
| YouTube video | `Summaries/<Channel>/<Title>.md` | `youtube` | `source`, `views`, `creator`, `duration`, `uploaded`, `transcript` |
| Article / blog | `Summaries/<Title>.md` | `article` | `creator`, `source` (URL), `published` |
| PDF / paper | `Summaries/<Title>.md` | `paper` | `authors`, `source`, `published` |
| EPUB / book | `Summaries/<Title>.md` | `book` | `creator`, `published`, `isbn`, `source` |
| Podcast | `Summaries/<Show>/<Title>.md` | `podcast` | `source`, `creator`, `duration`, `transcript` |

**All notes** get: `created`, `updated`, `date`, `summary` (1-line), `unread: true`

## Step 3: Analyze structure and determine depth

### 3a. Summary depth from source length

| Source word count | Examples | Target summary words | Sections | TLDR |
|---|---|---|---|---|
| <1,500 | 5-min video, short article | 200-400 | 1-2 | 2 sentences |
| 1,500-5,000 | 10-20 min video, blog post | 500-1,200 | 3-5 | 3 sentences |
| 5,000-15,000 | 30-60 min video, long article | 1,500-3,000 | 5-8 | 3-4 sentences |
| 15,000-40,000 | 1-3 hr video/podcast | 3,000-6,000 | 8-15 | 4-5 sentences |
| 40,000-80,000 | Short book | 5,000-10,000 | 15-25 | 5 sentences |
| 80,000+ | Full book (200+ pages) | 8,000-15,000 | 20-40 | 5 sentences |

**The ratio is roughly 1:5 to 1:10.** Denser content skews higher; conversational or repetitive content skews lower.

**Quick mode override:** if the user requested `quick`, produce only the TLDR plus an overview of at most 500 words, regardless of source length. Skip reference note creation entirely.

**Per-section depth:** each section's word budget is proportional to its share of the source material.

### 3b. Plan sections and dispatch

**For long content (>3,000 source words):** dispatch parallel **Opus** subagents -- one per section. Each subagent gets:

- The section text
- The audience level
- A specific word count target
- Instructions to use `[[wikilinks]]` for key people, major concepts, and important terms (NOT every proper noun)

**For short content (<3,000 source words):** summarize directly without subagents.

### Book-specific depth requirements

- Each chapter MUST get its own `## Chapter N: Title` section with 300-600 words
- Include key arguments, data points, examples, and notable quotes
- Do NOT collapse multiple chapters into a single paragraph
- A 10-chapter book = ~3,000-6,000 words of summary
- A 30-chapter book = ~5,000-10,000 words
- Each chapter summary should stand alone -- someone reading it should understand what that chapter argues

## Step 4: Assemble the summary note

### Structure

```markdown
---
[frontmatter per Step 2]
---

[embed if applicable: ![[file.pdf]], source URL, etc.]

> [!tldr]
> [Overview -- sentence count per depth table. What is it about, who made it, key takeaways.]

## [Section 1 Title]

[Summary paragraphs with [[wikilinks]] to key concepts and people]

## [Section 2 Title]

[...]

## People Mentioned

- [[Person Name]] -- brief context
```

### Formatting rules

1. **No `# Title` heading** -- the filename is the title
2. **Never repeat frontmatter in the body**
3. **`> [!tldr]`** for the overview
4. **`> [!quote]`** callouts for notable quotes, with speaker and source location (see the example after this list)
5. **Wikilink selectively** -- people, major concepts, companies, and terms worth revisiting. NOT every proper noun or common term. Ask: "Would I ever want a note about this?" If no, don't wikilink it.
6. **Timestamps** on headings and quotes when available (YouTube, podcasts)
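A sketch of what rules 4 and 6 produce together (speaker, quote, and timestamp all illustrative):

```markdown
> [!quote] Jane Doe (14:32)
> "The quote worth keeping, attributed and timestamped."
```

Obsidian renders the text after the callout type as the callout title, which is where the speaker and timestamp go.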
### Audience adaptation

- **ELI5 / beginner**: plain language, analogies, explain jargon before using it
- **General reader**: balanced -- explain key terms but don't over-simplify
- **Technical / expert**: technical language is fine; focus on novel contributions

## Step 5: Create reference notes (selective, one layer deep)

**This step is skipped entirely in `quick` mode.**

In `full` mode, create notes only for:

- **People** who are central to the content (authors, creators, key figures discussed at length)
- **Major concepts** that are core to the content's argument (not every term mentioned in passing)
- **Companies/products** that play a significant role

In `deep` mode, create notes for everything wikilinked.

**Rule of thumb for `full` mode:** a 10-chapter book should generate roughly 5-15 reference/people notes, not 50+.

### 5a. Extract and audit wikilinks

```bash
grep -o '\[\[[^]|]*' "<summary_note_path>" | sed 's/\[\[//' | sort -u
```

Check which are missing:

```bash
grep -o '\[\[[^]|]*' "<summary_note_path>" | sed 's/\[\[//' | sort -u |
while read -r term; do
  found=$(find "$VAULT_ROOT" -name "$term.md" \
    -not -path "*/.Trash/*" 2>/dev/null | head -1)
  [ -z "$found" ] && echo "MISSING: $term"
done
```

**Always run the audit. Do not guess which notes exist.**

### 5b. Create missing notes

#### Concepts, companies, products

Create in `References/<Term>.md`:

```markdown
---
created: YYYY-MM-DDTHH:mm
updated: YYYY-MM-DDTHH:mm
type: reference
unread: true
---

[2-4 sentence plain-language explanation. Cross-reference related concepts with [[wikilinks]].]
```

#### People

Create in `People/<Full Name>.md` using the person template. Conventions:

- **Public figures**: research and write a substantive bio (career, key facts, why they matter)
- **Private individuals**: minimal note with only what's known from the content
- **No `# Title` heading** -- Obsidian shows the filename as the title
- **`unread: true`** in frontmatter

#### Dispatch in parallel

For >10 missing notes, use parallel **Opus** subagents in batches of ~20-25.

### 5c. Verify -- no dangling links

Re-run the audit from 5a. If any links are still missing, create the notes. **The summary is not done until verification passes** (except in `quick` mode, where wikilinks are allowed to dangle).

## Step 6: Final output

Write the summary note to `$VAULT_ROOT/$SUMMARIES_DIR/`. Report to the user:

- The summary note path
- The number of reference/people notes created (if any)
- Any issues encountered (extraction problems, missing tools, etc.)

## Model usage

| Task | Model |
|------|-------|
| Content extraction | Scripts (marker, pdftotext, defuddle, yt-dlp, pandoc) |
| Section summarization | **Opus** subagents (parallel) |
| Reference note creation | **Opus** subagents (parallel batches) |
| Person note creation | **Opus** |

## Key rules

1. **Wikilink selectively** -- key people, major concepts, companies. Not every noun.
2. **One layer deep** -- create reference/people notes for wikilinked terms missing notes (full/deep mode only)
3. **No `# Title` headings** -- Obsidian shows the filename as the title
4. **Never repeat frontmatter in the body**
5. **Set `unread: true`** on every note created or modified
6. **Parallel Opus subagents** for long content
7. **Audience-appropriate language**
8. **Always embed/link the source** in frontmatter
9. **`> [!tldr]` is mandatory** -- every summary starts with a concise overview
10. **Respect the depth table** -- summary length must be proportional to source length
11. **Quick mode exists** -- when the user wants speed over completeness, honor it