---
name: adding-notes
description: Add new notes to the Second Brain knowledge base. Use when the user provides a resource (URL, book, podcast, article, GitHub repo, Reddit thread) and asks to "add a note", "create a note", "save this", "add to my notes", "take notes on", or "capture this".
allowed-tools: Read, Write, Bash, WebFetch, Glob, Grep, Task, TaskOutput, WebSearch, AskUserQuestion
---

# Adding Notes to Second Brain

Add content to the knowledge base with proper frontmatter, tags, summaries, and wiki-links.

## Content Type Routing

Detect type from URL, then load the appropriate reference file.

| URL Pattern | Type | Reference |
|-------------|------|-----------|
| youtube.com | See [YouTube Classification](#youtube-classification) | `references/content-types/youtube.md` or `talk.md` or `podcast.md` |
| reddit.com | reddit | `references/content-types/reddit.md` |
| github.com | github | `references/content-types/github.md` |
| imdb.com/title/, themoviedb.org/movie/ | movie | `references/content-types/movie.md` |
| goodreads.com/series/ | manga | `references/content-types/manga.md` |
| goodreads.com, amazon.com (books) | book | `references/content-types/book.md` |
| spotify.com/episode, podcasts.apple.com | podcast | `references/content-types/podcast.md` |
| udemy.com, coursera.org, skillshare.com | course | `references/content-types/course.md` |
| *.substack.com/p/*, *.beehiiv.com/p/*, buttondown.email/* | newsletter | `references/content-types/newsletter.md` |
| Other URLs | article | `references/content-types/article.md` |
| No URL | note | `references/content-types/note.md` |
| Manual: `quote` | quote | `references/content-types/quote.md` |
| Manual: `evergreen` | evergreen | `references/content-types/evergreen.md` |
| Manual: `map` | map | `references/content-types/map.md` |

### YouTube Classification

YouTube URLs require sub-classification before processing:

1. **Known podcast channel?** → `references/content-types/podcast.md`
2.
**Known talk channel OR conference title?** → `references/content-types/talk.md`
3. **Tutorial signals?** → `references/content-types/youtube.md` with `isTechnical: true`
4. **Default** → `references/content-types/youtube.md`

See `references/content-types/youtube.md` for full classification logic and channel lists.

---

## Scripts Reference

Only use scripts that fetch external data or perform complex processing:

| Script | Purpose |
|--------|---------|
| `get-youtube-metadata.sh URL` | Video title, channel |
| `get-youtube-transcript.py URL [--format FORMAT]` | Video transcript (see formats below) |
| `get-podcast-transcript.py [opts]` | Podcast transcript |
| `get-reddit-thread.py URL --comments N` | Thread + comments |
| `get-goodreads-metadata.sh URL` | Book metadata |
| `get-manga-metadata.sh URL` | Manga series data |
| `get-github-metadata.sh URL` | Repo stats |
| `find-related-notes.ts FILE [--limit N] [--min-score N]` | Semantic search using project embeddings |

### Transcript Format Options

```bash
get-youtube-transcript.py URL                        # plain (default) - single blob
get-youtube-transcript.py URL --format sentences     # one sentence per line (grep-friendly)
get-youtube-transcript.py URL --format timestamped   # [MM:SS] per segment
get-youtube-transcript.py URL --format json          # full metadata with timestamps
```

**Recommended:** Use `--format sentences` for large transcripts—enables grep/search and chunked reading.
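A minimal sketch of that grep-driven workflow, using a stand-in `transcript.txt` in place of a live fetch (the file contents here are placeholders, not real transcript output):

```shell
# Stand-in for: get-youtube-transcript.py URL --format sentences > transcript.txt
cat > transcript.txt <<'EOF'
Welcome to the show.
Today we discuss habits and systems.
A habit compounds like interest.
Thanks for listening.
EOF

grep -n -i 'habit' transcript.txt        # locate mentions with line numbers
grep -n -i -C 1 'habit' transcript.txt   # include one line of context per match
```

Because each sentence is its own line, `grep -n` gives you jump targets for chunked reads instead of forcing the whole transcript into context.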
**Do NOT use scripts for trivial operations** — do them inline:

- Author check: `Glob` with `content/authors/*{lastname}*.md`
- Frontmatter: Write YAML directly
- Tag lookup: `Grep` or knowledge from prior notes

---

## Workflow Phases

```text
Phase 1: Type Detection → Route to content-type file
Phase 2: Parallel Metadata Collection → Per-type agents
Phase 2.5: Large Transcript Handling → Subagent for >10K token transcripts
Phase 3: Author Creation → See references/author-creation.md
Phase 4: Content Generation → Apply writing-style, generate body
Phase 4.25: Diagram Evaluation → REQUIRED visual assessment with logged outcome
Phase 4.5: Connection Discovery → Find genuine wiki-link candidates (if any exist)
Phase 5: Quality Validation → Parallel validators
Phase 6: Save Note → Write to content/{slug}.md with link density report
Phase 7: MOC Placement → Suggest placements + check MOC threshold
Phase 8: Quality Check → Run pnpm lint:fix && pnpm typecheck
```

### Phase 1: Type Detection & Dispatch

1. **Detect type from URL** using the Content Type Routing table above (no script needed)
2. **Load the content-type reference file** for detailed handling
3. Detect `isTechnical` flag (see content-type file for criteria)

### Phase 2: Metadata Collection

Spawn parallel agents as specified in the content-type file. Each file lists:

- Required scripts to run
- Agent configuration
- Special handling notes

**If `isTechnical: true`:** Also spawn code extraction agent (see `references/code-extraction.md`).

### Phase 2.5: Large Transcript Handling

For podcasts/videos with transcripts >10K tokens, use a dedicated subagent instead of reading directly.

**Detection:** If transcript file exceeds 50KB or initial read fails with token limit error.

**Option A: Transcript Analysis Subagent (Recommended)**

Spawn a Task with `subagent_type: general-purpose`:

```text
Analyze this transcript and extract structured content for a knowledge base note.

**Instructions:**

1. Read the transcript file at: {transcript_path}
2. Extract and return:

## Timestamps
| Time | Topic |
|------|-------|
(Major topic shifts with approximate times)

## Key Arguments
(3-5 main claims with supporting reasoning, 2-3 sentences each)

## Notable Quotes
(4-6 verbatim quotes that capture core ideas, with speaker attribution)

## Named Frameworks
(Any models, principles, or processes given specific names)

## Diagram Candidates
(Any process, system, or framework worth visualizing)

**Output:** Structured markdown, max 1500 words.
```

**Option B: Chunked Extraction (Fallback)**

If subagent unavailable, use `--format sentences` and manual chunking:

1. Fetch with sentences format: `get-youtube-transcript.py URL --format sentences > transcript.txt`
2. Read first 100 lines (intro, episode overview)
3. Read last 100 lines (conclusion, wrap-up)
4. Grep for key terms mentioned in intro
5. Extract quotes around grep matches with `-C 3` context

**Benefits of Subagent Approach:**

| Aspect | Direct Read | Subagent |
|--------|-------------|----------|
| Context usage | Fills main context with raw text | Returns only structured output |
| Parallelism | Sequential processing | Runs alongside other agents |
| Semantic analysis | Manual grep for terms | Agent identifies themes |
| Output quality | May miss connections | Comprehensive extraction |

### Phase 3: Author Creation

For external content, check if author exists:

```text
Glob: content/authors/*{lastname}*.md
```

- **Match found:** Use existing slug
- **Partial match:** Use AskUserQuestion to confirm identity
- **No match:** Create new author per `references/author-creation.md`

### Phase 4: Content Generation

1. **Load writing-style skill** (REQUIRED): `Read .claude/skills/writing-style/SKILL.md`
2. **Load linking philosophy** (REQUIRED): `Read .claude/skills/adding-notes/references/linking-philosophy.md`
3. If `isTechnical`: collect code snippets from Phase 2
4.
**Compile frontmatter** using template from content-type file
5. **Generate body** with wiki-links (see Phase 4.5 for connection discovery)

**Tags:** 3-5 relevant tags. Use tags you've seen in prior notes or `Grep` for similar content to find existing tags.

**Summary:** Frame as a core argument, not a description. What claim does this content make?

### Phase 4.25: Diagram Evaluation (REQUIRED)

1. Load `references/diagrams-guide.md`
2. Apply the decision tree based on content type priority
3. Log outcome (REQUIRED):
   - Adding: `✓ Diagram added: [mermaid-type] - [description]`
   - Skipping: `✓ No diagram needed: [specific reason]`

### Phase 4.5: Connection Discovery

Load `references/linking-philosophy.md` and follow the discovery checklist:

1. **Same-author check** (highest priority): `Grep pattern: "authors:.*{author-slug}" glob: "content/*.md"`
2. **Tag-based discovery**: `Grep pattern: "tags:.*{tag}" glob: "content/*.md" limit: 5`
3. **Evaluate**: "Would I naturally reference this when discussing the topic?"

Only add genuine connections with explanatory context. Orphans are acceptable.

### Phase 5: Quality Validation

Spawn parallel validators:

| Validator | Checks |
|-----------|--------|
| Wiki-link exists | Each `[[link]]` exists in `content/` (excluding Readwise) |
| Link context | Each link has adjacent explanation (not bare "See also") |
| Duplicate | Title/URL doesn't already exist |
| Tag | Tags match or are similar to existing tags |
| Type-specific | E.g., podcast: profile exists, guest not in hosts |

**Wiki-link note:** Readwise highlights (`content/readwise/`) are excluded from Nuxt Content and won't resolve as valid wiki-links. Use plain text or italics for books/articles that only exist in Readwise.

**If issues found:** Use AskUserQuestion to offer: Fix issues / Save anyway / Cancel.

**If no issues:** Log "✓ Validation passed" and proceed.

### Phase 6: Save Note

Generate slug inline: lowercase the title, replace spaces with hyphens, remove special characters.
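That rule can be sketched as a small shell helper (`slugify` is a hypothetical name for illustration; in practice the slug is generated inline, not via a script):

```shell
# Lowercase, spaces → hyphens, drop anything outside [a-z0-9-].
slugify() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr ' ' '-' | tr -cd 'a-z0-9-'
}

slugify "Superhuman Is Built for Speed"   # → superhuman-is-built-for-speed
```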
Example: `"Superhuman Is Built for Speed"` → `superhuman-is-built-for-speed`

Save to `content/{slug}.md`. Confirm with link density status:

```text
✓ Note saved: content/{slug}.md
- Type: {type}
- Authors: {author-slugs}
- Tags: {tag-count} tags
- Diagram: {diagram-status}
- Wiki-links: {link-count} connections ({status})
  - [[link-1]] (why: {context})
  - [[link-2]] (why: {context})
```

**Diagram status:** `Added: [type] - [description]` or `None: [reason]`

**Link density status:**

- `{link-count} >= 3`: "well-connected"
- `{link-count} = 1-2`: "connected"
- `{link-count} = 0`: "standalone" (fine when no genuine connections exist)

### Phase 7: MOC Placement (Non-blocking)

See `references/moc-placement.md` for detailed workflow:

1. Suggest existing MOC placement via cluster script
2. Check if any tag exceeds 15-note threshold for new MOC creation

### Phase 8: Quality Check

Run linter and type check to catch any issues:

```bash
pnpm lint:fix && pnpm typecheck
```

If errors are found, fix them before completing the task.
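As a worked example, the link-density buckets from Phase 6 map directly onto a case statement (`link_status` is a hypothetical helper, not a project script):

```shell
# Classify a note by its wiki-link count, per the Phase 6 thresholds.
link_status() {
  case "$1" in
    0)   echo "standalone" ;;
    1|2) echo "connected" ;;
    *)   echo "well-connected" ;;
  esac
}

link_status 0   # → standalone
link_status 2   # → connected
link_status 5   # → well-connected
```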
---

## Error Handling

| Error | Recovery |
|-------|----------|
| Metadata agent fails | Prompt for manual entry or WebFetch fallback |
| Transcript unavailable | Note "No transcript available" in body |
| Transcript too large (>10K tokens) | Use Phase 2.5 subagent or chunked extraction |
| Author not found online | Create minimal profile (name only) |
| Reddit 429 | Wait 60s and retry |
| Semantic analysis timeout | Proceed without wiki-link suggestions |
| Validation crash | Warn user, recommend manual check |

---

## Reference Files

| File | Purpose |
|------|---------|
| `references/author-creation.md` | Author profile workflow |
| `references/diagrams-guide.md` | When/how to add mermaid diagrams |
| `references/linking-philosophy.md` | Connection quality standards |
| `references/moc-placement.md` | MOC suggestion and creation |
| `references/code-extraction.md` | Technical content code snippets |
| `references/podcast-profile-creation.md` | Podcast show profiles |
| `references/newsletter-profile-creation.md` | Newsletter publication profiles |
| `references/content-types/*.md` | Type-specific templates |