---
name: roam-transcript-sync
description: Synchronize meeting transcripts from ro.am to Obsidian vault. Automatically downloads new meeting transcripts, classifies them as one-on-one or group meetings, routes them to appropriate directories, and processes them with meeting-memo skill. Use when the user asks to "sync ro.am meetings", "import meeting transcripts", "download new meetings from ro.am", or "update meeting notes from ro.am". Configurable for any Obsidian vault structure with PARA organization.
args:
  - name: vault_path
    description: Absolute path to Obsidian vault root directory
    required: true
    example: ~/Documents/second-brain
  - name: owner_name
    description: Full name of vault owner for one-on-one detection
    required: true
    example: Joe Doe
  - name: people_dir
    description: Relative path from vault root to one-on-one meetings directory
    required: true
    default: 2-Areas/Meetings/People
  - name: teams_dir
    description: Relative path from vault root to team meetings directory
    required: true
    default: 2-Areas/Meetings/Teams
  - name: mapping_file
    description: Relative path from vault root to team mapping configuration
    required: true
    default: 2-Areas/Meetings/Teams/mapping.md
  - name: failures_file
    description: Relative path from vault root to sync failures log
    required: true
    default: 2-Areas/Meetings/_sync-failures.md
  - name: daily_notes_dir
    description: Relative path from vault root to daily notes directory (for meeting-memo)
    required: true
    default: DailyNotes
---

# Ro.am Transcript Sync

Synchronize meeting transcripts from ro.am to Obsidian vault with classification and routing.

## Workflow

### Step 0: Pre-flight Validation

**MANDATORY FIRST STEP.** Run before any processing:

```bash
python3 scripts/preflight.py \
  --vault-path "$vault_path" \
  --owner-name "$owner_name" \
  --people-dir "$people_dir" \
  --teams-dir "$teams_dir" \
  --daily-notes-dir "$daily_notes_dir" \
  --mapping-file "$mapping_file"
```

Exit code `0` = proceed. Exit code `1` = stop and fix reported issues.

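
If the pre-flight check is driven from Python rather than the shell, the exit-code gate could look roughly like the sketch below. The helper name `preflight_ok` is hypothetical and not part of the skill's scripts; it only assumes that the command signals success with exit code `0`, as described above.

```python
import subprocess
import sys


def preflight_ok(cmd: list[str]) -> bool:
    """Run a pre-flight command and return True only when it exits with 0."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface whatever the command reported so the issues can be fixed.
        print(result.stdout + result.stderr, file=sys.stderr)
    return result.returncode == 0


# Hypothetical usage (paths and values are placeholders):
#   cmd = ["python3", "scripts/preflight.py", "--vault-path", vault_path, ...]
#   if not preflight_ok(cmd):
#       raise SystemExit("Pre-flight failed: fix the reported issues first")
```

Returning a boolean instead of raising keeps the stop-or-proceed decision with the caller, which matches the "stop and fix reported issues" instruction.
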
### Step 1: Determine Sync Range

```bash
python3 scripts/sync_state.py "$vault_path" since
```

Returns last sync timestamp, or 7 days ago if first run.

### Step 2: List and Filter Meetings

Fetch meetings via ro.am MCP:

```
mcp__claude_ai_Ro_am__list_meetings({
  "after": "",
  "expand": ["summary", "chapters", "participants"]
})
```

**CRITICAL:** Meeting objects use field `id`, NOT `savedMeetingId`. Pass `meeting['id']` as the `savedMeetingId` parameter in subsequent API calls.

**Filter:** Only process meetings where `owner_name` appears in participants (case-insensitive). Skip meetings where the owner did not participate.

### Step 3: Deduplication Check

For each meeting, check if transcript and memo files already exist:

- Build expected filename: `{YYYY-MM-DD} {sanitized_title}.vtt` and `.md`
- For 1:1 meetings: check `{people_dir}/{participant}/transcripts/` and `{people_dir}/{participant}/`
- For group meetings: check all subdirectories under `{teams_dir}/`
- Skip meetings where both transcript AND memo already exist

### Step 4: Phase 1 - Parallel Transcript Fetch

Spawn one background agent per meeting using the Task tool with `run_in_background=true`. Each agent writes the transcript to a temp file.

**Agent prompt template** (use exactly this pattern - minimal, MCP-only):

```
Fetch the transcript for ro.am meeting and save it to a file.

1. Call mcp__claude_ai_Ro_am__get_meeting_transcript with savedMeetingId: "{meeting_id}"
2. Write the FULL transcript text to: /tmp/roam_sync_{meeting_id}.txt

RULES:
- Do NOT check for ROAM_API_TOKEN or environment variables
- Do NOT use Bash to check anything
- Do NOT read any memory files or config files
- ONLY call the MCP tool above, then write the result with the Write tool
- The file must contain ONLY the transcript text, nothing else
```

**Why this specific pattern:**

- Agents that read memory files or check env vars waste 1-2 min each
- Agents that return data via TaskOutput produce JSONL logs, not clean text
- Writing to explicit temp files lets the main agent read them directly

After spawning all agents, continue to Phase 2 when they complete. Use `TaskOutput(task_id, block=True)` to wait for each agent, then read the transcript from `/tmp/roam_sync_{meeting_id}.txt`.

### Step 5: Phase 2 - Sequential Processing

Main agent handles all classification and file operations sequentially.

#### 5a. Load Team Mapping (once)

Read `{vault_path}/{mapping_file}` and parse the markdown table:

```
| Team Name | Core Members | Detection Keywords | Priority Rules |
```

#### 5b. For Each Meeting (sequential):

**i. Classify:**

- 2 participants total = one-on-one, target = the other participant
- 3+ participants = group meeting, match against team mappings by core members then keywords
- No match = `Unclassified`

**ii. Build paths:**

- 1:1: `{vault_path}/{people_dir}/{sanitized_name}/`
- Group: `{vault_path}/{teams_dir}/{team_name}/`
- Create directory and `transcripts/` subdirectory

**iii. Read transcript from temp file:**

```
Read /tmp/roam_sync_{meeting_id}.txt
```

**iv. Generate VTT and save:**

```python
from scripts.generate_vtt import generate_vtt

vtt_content = generate_vtt(
    transcript=transcript_text,
    start_time=meeting['start'],
    end_time=meeting['end']
)
```

Save to `{target_dir}/transcripts/{date} {title}.vtt`. Fallback to `.txt` if VTT generation fails.

**v. Invoke meeting-memo skill** (one at a time - prevents daily note race conditions):

- Create temp file with transcript content
- Invoke meeting-memo with `vault_path`, appropriate `meetings_dir` (people_dir or teams_dir), and `daily_notes_dir`
- Clean up temp file after completion

**vi. Clean up temp file:**

Delete `/tmp/roam_sync_{meeting_id}.txt` after processing.

### Step 6: Update Sync State

```bash
python3 scripts/sync_state.py "$vault_path" update
```

### Step 7: Display Sync Summary

```
================================================================
SYNC SUMMARY REPORT
================================================================
Total meetings found: {n}
Owner participated in: {n}
Already synced (skipped): {n}
Newly processed: {n}
Failed: {n}

Phase 1 (parallel fetch): {n} seconds
Phase 2 (sequential processing): {n} seconds
Total sync time: {n} seconds
================================================================
```

If failures occurred, list them and reference `{failures_file}`.

## Directory Structure

```
{vault_path}/2-Areas/Meetings/
  People/{name}/
    transcripts/{date} {title}.vtt
    {date} {title}.md
  Teams/{team}/
    transcripts/{date} {title}.vtt
    {date} {title}.md
```

## Common Errors

**KeyError 'savedMeetingId':** Use `meeting['id']` from list_meetings, pass it as `savedMeetingId` parameter.

**VTT generation failure:** Automatic fallback to plain text `.txt`. Check `scripts/generate_vtt.py` if ro.am transcript format changes.

**Classification failure:** Falls back to `Unclassified` team. Check mapping.md format.

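
The classification rules from Step 5b, including the `Unclassified` fallback noted above, could be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's actual logic: `TeamRule` and `classify` are hypothetical names, participants are assumed to be a list of display-name strings, a team is assumed to match when all of its core members attended, and keywords are assumed to be matched against the meeting title.

```python
from dataclasses import dataclass, field


@dataclass
class TeamRule:
    # Hypothetical in-memory form of one row of the team mapping table.
    name: str
    core_members: set[str]
    keywords: list[str] = field(default_factory=list)


def classify(meeting: dict, owner_name: str, teams: list[TeamRule]) -> tuple[str, str]:
    """Return ("one-on-one", other participant) or ("group", team name)."""
    participants = [p.strip() for p in meeting["participants"]]
    others = [p for p in participants if p.lower() != owner_name.lower()]

    # Exactly two participants, one of them the owner: a one-on-one.
    if len(participants) == 2 and len(others) == 1:
        return "one-on-one", others[0]

    # First pass: core members (assumed rule: every core member attended).
    present = {p.lower() for p in participants}
    for team in teams:
        core = {m.lower() for m in team.core_members}
        if core and core <= present:
            return "group", team.name

    # Second pass: detection keywords (assumed to be searched in the title).
    title = meeting.get("title", "").lower()
    for team in teams:
        if any(k.lower() in title for k in team.keywords):
            return "group", team.name

    # No match: fall back to the Unclassified team.
    return "group", "Unclassified"
```

Checking core members before keywords mirrors the "by core members then keywords" order in Step 5b. The `Priority Rules` column is not modelled here because its semantics are not described in this document.
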
## Error Handling

- Log failures to `{vault_path}/{failures_file}` with sanitized info (no absolute paths, no tokens)
- Continue processing remaining meetings if one fails
- Detailed debug log at `{vault_path}/.transcript-sync-debug.log` (permissions 0600)

## Scripts Reference

| Script | Purpose |
|--------|---------|
| `scripts/preflight.py` | Pre-flight validation (deps, vault, permissions) |
| `scripts/sync_state.py` | Sync timestamp management |
| `scripts/generate_vtt.py` | Convert MCP transcript to VTT format |
| `scripts/temp_file.py` | Secure temp file creation |
| `scripts/failover_utils.py` | Retry logic, failure logging, state recovery |
| `scripts/recover.py` | Recovery queue management |

## Security

All external data must pass through `_shared/validation.py`:

- `sanitize_participant_name()` / `sanitize_meeting_title()` for filenames
- `validate_directory_safe()` to prevent path traversal
- `sanitize_meeting_metadata()` before passing to meeting-memo
- Never expose absolute paths in user-facing errors

For full security documentation, see [ARCHITECTURE.md](ARCHITECTURE.md#security).
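
To make the two main checks concrete, here is a minimal sketch of filename sanitization and a path-traversal guard. It is not the contents of `_shared/validation.py`: the names `safe_filename_part` and `is_inside`, the set of rejected characters, and the 100-character limit are all assumptions chosen for illustration.

```python
import re
from pathlib import Path

# Characters assumed unsafe in filenames: path separators, reserved
# punctuation, and ASCII control characters.
_UNSAFE = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename_part(text: str, max_len: int = 100) -> str:
    """Reduce an untrusted title or name to a single safe filename component."""
    cleaned = _UNSAFE.sub(" ", text)
    # Collapse whitespace runs, then drop leading/trailing dots and spaces
    # so the result can never be "." or "..".
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned[:max_len].strip() or "untitled"


def is_inside(base: Path, candidate: Path) -> bool:
    """True when candidate resolves to base itself or a location beneath it."""
    base_resolved = base.resolve()
    resolved = candidate.resolve()
    return resolved == base_resolved or base_resolved in resolved.parents
```

A caller would typically sanitize the title first, build the target path from the vault root, and refuse to write unless `is_inside(vault_root, target)` holds.
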