--- name: youtube-transcript description: Download YouTube video transcripts with automatic frame extraction for visual references. Use when analyzing YouTube videos, tutorials, or conference talks. type: command --- # YouTube Transcript Skill Download and analyze YouTube video transcripts with automatic frame extraction at visual reference points. ## Usage ``` /youtube-transcript ``` ## What This Skill Does 1. **Downloads transcript** with timestamps (YouTube auto-captions or manual subtitles) 2. **Detects visual references** in the text ("look at this diagram", "as you can see here") 3. **Extracts frames** at those timestamps for context 4. **Presents** transcript with embedded images for analysis ## Requirements ### macOS ```bash brew install yt-dlp ffmpeg ``` ### Linux (Ubuntu/Debian) ```bash sudo apt install ffmpeg pip install yt-dlp ``` ### Linux (Arch) ```bash sudo pacman -S yt-dlp ffmpeg ``` ## Instructions for Claude When the user invokes `/youtube-transcript `: ### Step 1: Check Dependencies ```bash command -v yt-dlp >/dev/null 2>&1 || echo "MISSING: yt-dlp" command -v ffmpeg >/dev/null 2>&1 || echo "MISSING: ffmpeg" ``` If missing, show installation instructions for the user's platform. ### Step 2: Create Output Directory Use the scratchpad directory for output: ```bash # Extract video ID from YouTube URL (POSIX-compatible, works on macOS and Linux) # Handles: youtube.com/watch?v=ID, youtu.be/ID, youtube.com/embed/ID URL="" VIDEO_ID=$(echo "$URL" | sed -E 's/.*[?&]v=([^&]+).*/\1/;s|.*/embed/([^?/]+).*|\1|;s|.*youtu\.be/([^?/]+).*|\1|') OUTPUT_DIR="/youtube-${VIDEO_ID}" mkdir -p "${OUTPUT_DIR}/frames" ``` ### Step 3: Download Transcript ```bash cd "${OUTPUT_DIR}" # Try auto-generated subtitles first, fall back to manual yt-dlp --write-auto-sub --sub-lang en,de --skip-download --convert-subs srt -o "transcript" "" 2>/dev/null || \ yt-dlp --write-sub --sub-lang en,de --skip-download --convert-subs srt -o "transcript" "" ``` ### Step 4: Analyze for Visual References Read the transcript and identify timestamps where visual content is referenced. Look for patterns: **German:** - "schau(t)? (mal )?(hier|das)" - "(dieses|das|dieser) (Diagramm|Bild|Schema|Chart|Graph|Screen|Slide)" - "wie (du|ihr|Sie) (hier )?(siehst|sehen)" - "auf (dem|diesem) (Bildschirm|Screen|Slide)" - "hier sehen wir" **English:** - "look at this" - "as you can see" - "this (diagram|chart|slide|screen|image)" - "let me show you" - "here we have" - "on the screen" ### Step 5: Download Video and Extract Frames Only if visual references were found: ```bash # First, list available formats to find a working one yt-dlp -F "" # Then download using a specific format ID (prefer combined formats like 18 for 360p) # Format 18 is usually 360p mp4 with video+audio combined - most reliable yt-dlp -f 18 -o "${OUTPUT_DIR}/video.mp4" "" # If format 18 not available, try other combined formats (22=720p, 18=360p) # Or use: yt-dlp -f "best[height<=480]" -o "${OUTPUT_DIR}/video.mp4" "" # Extract frame at timestamp (example: 01:23) ffmpeg -ss 00:01:23 -i "${OUTPUT_DIR}/video.mp4" -frames:v 1 -q:v 2 "${OUTPUT_DIR}/frames/01_23.jpg" ``` **Important:** Avoid `-f worst` or complex format selectors - they often hang due to yt-dlp JS runtime issues. Use explicit format IDs instead. Extract frames for each detected visual reference timestamp. ### Step 6: Present Results Create a summary with: 1. **Video metadata** (title, duration, channel) 2. **Full transcript** with timestamps 3. **Visual references** - show the extracted frames inline where referenced 4. **Ready for questions** - offer to analyze specific parts Example output format: ```markdown ## Video: [Title] **Channel:** [Channel Name] **Duration:** [Duration] --- ## Transcript [00:00] Introduction to the topic... [01:23] As you can see in this diagram... ![Frame at 01:23](frames/01_23.jpg) [02:45] Let's move on to the next point... --- ## Extracted Frames | Timestamp | Context | Frame | |-----------|---------|-------| | 01:23 | "this diagram shows..." | frames/01_23.jpg | --- Ready to answer questions about this video. ``` ## Cleanup The output directory in scratchpad will be automatically cleaned up. If the user wants to keep the files, they should copy them to their project. ## Limitations - Requires yt-dlp and ffmpeg installed locally - Some videos may not have transcripts available - Age-restricted or private videos may not be accessible - Very long videos (>2h) may take time to process ## Troubleshooting **No transcript found:** - Video may not have captions enabled - Try a different language: `--sub-lang en,de,es,fr` **Video download stuck at 0%:** - This is often caused by yt-dlp's format selection with JS runtime issues - Solution: List formats first (`yt-dlp -F `), then use explicit format ID (`yt-dlp -f 18 `) - Format 18 (360p mp4 combined) is usually the most reliable **JS Runtime Warning:** - yt-dlp may show "No supported JavaScript runtime" warning - This is usually fine for downloads, but can cause format selection issues - Install deno if needed: `brew install deno` or `curl -fsSL https://deno.land/install.sh | sh` **Connection timeouts:** - YouTube servers may be slow; retry with explicit format ID - Kill stuck process: `pkill -f "yt-dlp.*VIDEO_ID"` **yt-dlp errors:** - Update yt-dlp: `pip install -U yt-dlp` or `brew upgrade yt-dlp` **ffmpeg errors:** - Ensure ffmpeg is installed with video codecs