--- name: email-whiz description: >- Gmail copilot via MCP. Triage, inbox-zero, filters, analytics, labels, cleanup. Use when managing email or automating Gmail. NOT for composing emails, calendar, or non-Gmail. argument-hint: " [options]" model: opus disable-model-invocation: true allowed-tools: gmail_search_emails gmail_read_email gmail_list_email_labels gmail_create_label gmail_modify_email gmail_delete_email gmail_list_filters gmail_get_filter gmail_create_filter gmail_create_filter_from_template gmail_delete_filter gmail_batch_modify_emails gmail_batch_delete_emails gmail_get_or_create_label gmail_update_label gmail_delete_label gmail_send_email Read Grep Write context: fork license: MIT metadata: author: wyattowalsh version: "5.0.0" --- # Email Whiz Gmail inbox management via MCP. Parallel-first, large-inbox optimized. --- ## Canonical Vocabulary | Term | Definition | |------|------------| | **5-bucket** | DO / DELEGATE / DEFER / REFERENCE / NOISE — five triage buckets (historically "4D+N") | | **inbox zero** | State where inbox contains only items requiring action today | | **fast-lane** | Sender/subject-only classification without reading email content | | **security gate** | Subject-keyword scan of NOISE before archiving (catches 2FA, password resets) | | **session cache** | Local cache of Phase 0 results (1h TTL) at `~/.claude/email-whiz/session-cache.json` | | **tier** | Inbox size class: Small (<50) / Medium (50-500) / Large (500-5k) / Massive (5k+) | | **quick** | Modifier prefix that skips Phase 0 for lightweight modes | | **combo** | Chain two modes sharing discovered state: ` + ` | | **mega-wave** | Single message with 10-15 independent tool calls bundled together | | **baseline** | Rolling 30-day average inbox count; bankruptcy triggers are relative to this | | **auto-rule** | Gmail filter created from behavioral pattern analysis | | **VIP** | Sender who receives consistent replies and high-priority treatment | | **engagement** | Reply rate to a sender: replies sent / emails received | | **confidence** | HIGH >80% / MEDIUM 50-80% / LOW <50% — governs auto-rule actions | | **memory** | Persistent user preferences and patterns at `~/.claude/email-whiz/memory.json` | | **correction** | User disagreement with auto-classification, stored as a memory override | | **staleness** | Memory entry not confirmed/used beyond threshold (VIP: 90d, override: 60d) | --- ## Dispatch | `$ARGUMENTS` | Action | |--------------|--------| | `triage` | Categorize inbox with fast-lane + 5-bucket, batch process by category | | `inbox-zero` | Daily/weekly routine, progress tracking, baseline bankruptcy detection | | `filters` | Pattern-detected filter suggestions with confidence scoring | | `auto-rules` | Behavioral analysis → auto-create Gmail filters from templates | | `analytics` | Volume trends, response times, sender frequency, timing patterns | | `newsletters` | Subscription audit + unsubscribe plan | | `labels` | Taxonomy analysis, merge/rename/delete recommendations | | `search ` | Natural language → Gmail search query | | `senders` | VIP + noise identification | | `digest` | Recent important email summary | | `cleanup` | Archive candidates, duplicate detection, batch archiving | | `audit` | Full inbox health report with score and action plan | | `quick search` or `quick digest` | Skip Phase 0, run mode directly. Only search and digest are quick-compatible | | ` + ` | Chain modes, sharing discovered state (triage + filters) | | _(empty)_ | Auto-scan: run Phase 0 + account analysis, present prioritized action plan | **Classification gate:** If `$ARGUMENTS` could match multiple modes, ask which the user wants before proceeding. --- ## Hybrid Mode Protocol **Read operations — execute immediately:** `gmail_search_emails`, `gmail_read_email`, `gmail_list_email_labels`, `gmail_list_filters`, `gmail_get_filter` **Write operations — confirm before executing:** Single: `gmail_modify_email`, `gmail_delete_email`, `gmail_create_label` Batch: `gmail_batch_modify_emails`, `gmail_batch_delete_emails` Labels: `gmail_update_label`, `gmail_delete_label`, `gmail_get_or_create_label` Filters: `gmail_create_filter`, `gmail_create_filter_from_template`, `gmail_delete_filter` **Email deletion is FORBIDDEN unless the user explicitly requests it.** Never call `gmail_delete_email` or `gmail_batch_delete_emails` on your own initiative. Always archive instead. Only use delete tools when the user directly says "delete" — and even then, use the Destructive Warning template (TYPE "DELETE" confirmation) first. **Confirmation format:** See `templates.md`. Always show scope, count, and reversibility before write ops. --- ## Parallelization Rules 1. **Independent tool calls go in ONE message.** Sequential independent calls is a bug. 2. **Phase 0 fuses with mode's first queries** — never 2 sequential rounds when fusion is possible. 3. **Batch ops on disjoint ID sets: parallel.** NOISE + REFERENCE + DEFER = 1 message, 3 calls. 4. **Analytics/Audit: mega-wave** — 10-15 search queries in 1 message. 5. **gmail_read_email: batch 5-8 per message** (full messages are large). 6. **Filter creation: parallel** after conflict check passes. 7. **Combo mode boundary: sequential** (data dependency). But skip Phase 0 for mode 2. 8. **Max 15-20 calls per message.** Beyond that, a single 429 impacts the entire bundle. 9. **On 429:** preserve completed results, drop to sequential with 2s gaps. 3x429 in 60s → pause 60s. Details: `efficiency-guide.md` § Parallel Call Map. --- ## Phase -1: Memory Load (skip with `quick` prefix) Load long-term user memory before discovery. Runs once per session. `!uv run python scripts/memory.py load --mode ` Fuse with Phase 0 cache check in the SAME message (both are local file reads, zero API calls): ```bash !uv run python scripts/memory.py load --mode triage !uv run python scripts/inbox_snapshot.py cache load ``` If memory exists, integrate into session: - **VIP senders** → fast-lane DO classification (Pass 1) - **Noise senders** → fast-lane NOISE classification (Pass 1) - **Corrections** → highest priority: override any conflicting fast-lane rule - **Triage overrides** → apply before generic decision tree - **Filter history** → skip suggesting previously rejected filters - **Label prefs** → use preferred naming when creating labels - **Inbox patterns** → adjust expectations in analytics/audit Run `!uv run python scripts/memory.py prune` once per session if memory exists. Report pruned entries only if count > 0. --- ## Phase 0: Discovery (skip with `quick` prefix) **Prerequisite:** Run Phase -1 (Memory Load) first unless in `quick` mode. ### 0a: Session Cache Check `!uv run python scripts/inbox_snapshot.py cache load` If valid (< 1h old) → use cached labels, filters, tier. Jump to mode. ### 0b: Parallel Discovery + Mode Fusion (cache miss) Issue Phase 0 calls AND mode's first queries in a SINGLE message: Core discovery (always): 1. `gmail_list_email_labels` → label name/ID map + INBOX `messagesTotal`/`messagesUnread` 2. `gmail_list_filters` → existing filter criteria 3. `gmail_search_emails` → `"in:inbox is:unread"` (maxResults: 10) → sample Mode-specific additions (same message): - Triage: + `gmail_search_emails "is:unread"` (maxResults: per tier) - Inbox Zero: + `"is:unread newer_than:1d"` + `"label:_deferred"` - Analytics: + 4x weekly volume queries (mega-wave) - Audit: + newsletter count + sent count + overdue count (mega-wave) - Cleanup: + 4x stale-detection queries ### 0c: Tier Assessment Use INBOX label `messagesTotal` (NOT `resultSizeEstimate` — it is unreliable). | Inbox Size | Tier | maxResults | Strategy | |------------|------|------------|----------| | <50 | Small | 50 | Process all | | 50-500 | Medium | 100 | Sample recent + unread | | 500-5000 | Large | 200 | Importance-first, date-range splitting | | 5000+ | Massive | 200 | Date-range parallel queries, baseline bankruptcy | **Why maxResults 100-200:** MCP server has N+1 fetch pattern (each result triggers a separate API call). maxResults=500 ≈ 501 API calls ≈ 2505 quota units. 200 ≈ 1005 units. **Massive tier:** Split by date-range (30d chunks), issue ALL chunks in parallel. Always query `is:important is:unread` alongside general scan. No `pageToken` on MCP server — date-range splitting is the ONLY pagination. ### 0d: Write Session Cache `!uv run python scripts/inbox_snapshot.py cache save --labels '...' --filters '...' --inbox-count N --unread-count M --tier TIER` **Invalidate cache after any write operation** (filter/label create, batch modify). Use discovered label names throughout — never invent names that conflict with existing taxonomy. --- ## Auto-Scan (empty invocation) When invoked with no arguments, perform an intelligent account scan instead of showing a menu. ### Step 1: Mega-Wave Discovery (1 message) Fuse Phase 0 core discovery with 6 analysis queries in a SINGLE message (~10 calls): - Phase 0 core: `gmail_list_email_labels` + `gmail_list_filters` + `gmail_search_emails "in:inbox is:unread"` (maxResults: 10) - Inbox reach: `gmail_search_emails "in:inbox has:nouserlabels newer_than:7d"` (maxResults: 10) - Newsletters: `gmail_search_emails "in:inbox (list:* OR subject:newsletter)"` (maxResults: 50) - Stale deferred: `gmail_search_emails "label:_deferred older_than:7d"` (maxResults: 10) - Backlog: `gmail_search_emails "in:inbox is:unread older_than:7d"` (maxResults: 10) - Cleanup candidates: `gmail_search_emails "in:inbox (subject:receipt OR subject:confirmation) older_than:30d"` (maxResults: 10) - `!uv run python scripts/inbox_snapshot.py trend --days 7` - `!uv run python scripts/inbox_snapshot.py baseline --days 30` ### Step 2: Analyze + Score From results, compute: tier (from INBOX messagesTotal), bankruptcy level (baseline comparison), triage need (unread backlog), filter opportunity (noise sender clusters), newsletter burden, cleanup potential, deferred pile health. ### Step 3: Present Action Plan Use the Auto-Scan Report template (`templates.md` § Auto-Scan Report). Present 3-5 prioritized recommendations with estimated impact. Each maps to a mode. Offer: reply with number, mode name, `all` (chain top 3 sequentially sharing Phase 0 state), or `menu` (show full mode list). --- ## Orchestration Patterns This skill runs in `context: fork`. Apply these patterns internally: ### Decomposition Gate (before every mode) 1. List all tool calls needed for the mode 2. Classify: independent (different queries) vs dependent (needs prior results) 3. Bundle ALL independent calls into the fewest possible messages 4. Never execute independent calls sequentially ### Mode-Specific Wave Patterns | Mode | Wave 1 | Wave 2 | Wave 3 | Total | |------|--------|--------|--------|-------| | Auto-Scan | Phase 0 + 6 analysis (~10) | scoring (0 tools) | — | ~10 | | Triage | Phase 0 + fast-lane (~5) | 3x batch_modify | 5-8x read_email | ~16 | | Analytics | Phase 0 + mega-wave (~15) | per-entity batches (~12) | — | ~27 | | Audit | Phase 0 + mega-wave (~14) | per-entity batches (~20) | — | ~34 | | Cleanup | Phase 0 + 4 stale queries (~7) | batch_modify (~3) | — | ~10 | | Filters | Phase 0 + sender/subject (~6) | create_filter (~5) | — | ~11 | | Inbox Zero | Phase 0 + unread/deferred (~5) | batch ops (~3) | — | ~8 | | Combo | Phase 0 + mode 1 | mode 1 ops | refresh + mode 2 | varies | --- ## Mode: Triage Three-pass classification. Details: `triage-framework.md` § Fast-Lane, `efficiency-guide.md`. ### Pass 1: Fast-Lane (zero content reads) Classify by sender + subject + snippet ONLY from search results: - Memory corrections → override any conflicting rule (highest priority) - Memory VIP senders → DO (confidence from memory) - Memory noise senders → NOISE (confidence from memory) - `noreply@*`, `notifications@*`, `calendar-notification@*` → NOISE - Known VIP senders (from session cache) → DO - Subject `[FYI]`/`[INFO]`/newsletter/receipt/confirmation → REFERENCE - Subject `[ACTION]`/`[URGENT]`/deadline → DO - Unmatched → UNDECIDED ### Pass 1.5: Security Gate (mandatory before archiving NOISE) Scan all NOISE subjects/snippets for security keywords: `password reset`, `verify your`, `sign-in`, `unusual activity`, `2FA`, `account locked`, `verification code`, `one-time password`, `security alert`, `new device`, `suspicious login`, `recovery phone`, `two-step verification`, `confirm your identity`, `new sign-in method`. Any match → pull from NOISE to REVIEW. Also: any `noreply@` email `newer_than:7d` → REVIEW (login notifications have no expiry; codes expire but should still surface). ### Pass 2: Batch Operations (all in ONE message) **Precondition:** Call `gmail_get_or_create_label` for `_reference` and `_deferred` if not already resolved in Phase 0. Issue 3 `gmail_batch_modify_emails` calls simultaneously (disjoint ID sets = parallel safe): - NOISE ids → remove INBOX - REFERENCE ids → add `_reference`, remove INBOX - DEFER ids → add `_deferred`, remove INBOX ### Pass 3: Content Inspection (UNDECIDED + REVIEW only) Batch `gmail_read_email` 5-8 per message. Apply full 5-bucket. Batch-process results with same parallel pattern as Pass 2. Report with fast-lane triage summary (show auto-classified %). Surface recurring NOISE sender patterns as filter candidates. --- ## Mode: Inbox Zero Structured habit system. Details: `inbox-zero-system.md`. **Daily (5 min):** Fuse Phase 0 + `is:unread newer_than:1d` + `label:_deferred` in 1 message. Quick triage pass (5-bucket). Batch archive NOISE + REFERENCE. Report: inbox count before → after. Update progress file. **Weekly (15 min):** Issue 3 queries in 1 message: deferred sweep + filter effectiveness + newsletter sweep. Clear, re-defer, or archive. Update weekly_reviews. **Bankruptcy detection:** Relative to 30-day baseline (not absolute). YELLOW: 20-50% above baseline. ORANGE: 50-100% above baseline (aggressive triage + filter creation). RED: >100% above baseline (full bankruptcy protocol). Acceleration trigger: week-over-week growth >20% for 2+ weeks. New users default to 500 until 7+ snapshots establish a baseline. See `inbox-zero-system.md` for recovery protocols. --- ## Mode: Filters Pattern-detected filter suggestions. Details: `filter-patterns.md`. Issue sender clustering + subject pattern + `gmail_list_filters` ALL in 1 message. Score candidates: HIGH (volume ≥20, engagement <5%) → offer create. MEDIUM (≥10, <15%) → suggest. LOW → show only. Check conflicts before creating. After approval: issue multiple `gmail_create_filter` calls in 1 message. Massive tier: split 90d into 3x30d chunks, query in parallel. --- ## Mode: Auto-Rules Deep behavioral analysis → automated filter creation. Details: `filter-patterns.md` § Auto-Rule Detection. Issue sender analysis + subject mining + filter fetch in 1 message. Massive tier: split 90d into 3x30d date-range chunks, issue ALL in parallel. Score by confidence. Present auto-rules report. HIGH: `gmail_create_filter_from_template` with confirmation. MEDIUM: individual review. Creation: multiple filters in 1 message. Schedule learning loop after 2 weeks. --- ## Mode: Analytics Email communication metrics via mega-wave. Details: `analytics-guide.md`. Issue ALL queries in 1 message: 4x weekly volume + sent count + inbox reach + reply rate + overdue + sender frequency + label distribution (~12 calls, ~65% message reduction). Classify volume trend (growing/stable/declining). Compute inbox reach rate (good: <30%). Identify top 20 senders by volume + engagement matrix. Present full analytics report. --- ## Mode: Newsletters Subscription audit. Issue `list:*` + `subject:newsletter` in 1 message. Estimate frequency + read rate per subscription. Classify: UNSUBSCRIBE / KEEP / FILTER. Surface unsubscribe links. On approval: create filter or provide unsubscribe URL. Details: `workflows.md` § Unsubscribe Strategies. --- ## Mode: Labels Label taxonomy analysis. `gmail_list_email_labels` returns `messagesTotal`/`messagesUnread` per label. For 30d activity classification, issue per-label search queries in batches of 10-15. Find: duplicates/synonyms, stale labels (no messages 90d+), flat structure needing hierarchy. Suggest: MERGE / RENAME / DELETE / RESTRUCTURE. For merges: `gmail_batch_modify_emails` to migrate, then `gmail_delete_label`. Details: `workflows.md` § Label Hierarchy. --- ## Mode: Search Quick-compatible (no Phase 0). Natural language → Gmail search query. Parse intent (who/what/when/status), map to operators, add negations, show query + explanation + expected matches. Offer to run. Details: `filter-patterns.md` § Gmail Search Operators. --- ## Mode: Senders VIP + noise identification. Issue sample search + per-sender reply-rate queries in 1 message (if top-sender count > 12, split reply-rate queries across 2 messages to stay under call cap). VIP: reply rate >50% or consistently starred. Noise: volume >10, engagement 0%. Present sender report. On approval: create VIP filter (mark important) or noise filter. Details: `triage-framework.md` § Sender Shortcuts, `workflows.md` § VIP Management. --- ## Mode: Digest Quick-compatible (no Phase 0). Search `is:important newer_than:3d` (or `is:starred OR is:unread is:important`). Extract sender, subject, 1-line summary per email. Categorize: RESPOND / FYI / READING. Present digest report. Offer: "open N" | "archive fyi". Details: `templates.md` § Digest. --- ## Mode: Cleanup Archive candidates. Issue ALL 4 stale-detection queries in 1 message: receipts/confirmations `older_than:30d` + old newsletters `older_than:30d` + expired deferrals `older_than:14d` + fully-read no-label `older_than:60d`. Batch-archive ALL results in parallel `gmail_batch_modify_emails` calls. Archive only — never delete. Details: `workflows.md` § Batch Operations. --- ## Mode: Audit Full inbox health via mega-wave. Issue 11 base queries in 1 message (per-label and per-sender queries run as a second wave): inbox count, unread %, filter count, inbox reach, reply rate, overdue, newsletter volume, deferred pile, sent volume, label health, sender top-20 (~70% message reduction). Score 0-100 based on inbox reach rate, filter coverage, label health. Identify top 3 quick wins. Generate action plan. Details: `analytics-guide.md`, `templates.md` § Full Audit. --- ## Scope Boundaries **IS for:** Gmail triage, inbox zero, filters, auto-rules, analytics, newsletters, labels, search, senders, digest, cleanup, audit. **NOT for:** Calendar, Google Drive, non-Gmail, real-time monitoring, CRM. --- ## Reference File Index | File | Content | Load When | |------|---------|-----------| | `references/efficiency-guide.md` | Cache, parallel calls, tiers, MCP constraints, fast-lane, pagination | All modes | | `references/triage-framework.md` | 5-bucket decision trees, fast-lane rules, security gate, batch processing | triage, inbox-zero | | `references/filter-patterns.md` | Gmail operators, filter templates, auto-rule algorithms, learning loop | filters, auto-rules, search | | `references/workflows.md` | Bankruptcy, VIP, batch ops, label hierarchy, unsubscribe, combo mode | cleanup, labels, senders, newsletters | | `references/templates.md` | All confirmation, report, and execution result templates | Every mode | | `references/inbox-zero-system.md` | Daily/weekly routines, progress schema, baseline bankruptcy | inbox-zero | | `references/analytics-guide.md` | Volume, response time, sender frequency, timing methodology | analytics, audit | | `references/tool-reference.md` | All 18 Gmail MCP tool signatures, parameters, MCP constraints | When selecting tools | **Scripts:** | Script | Purpose | Run When | |--------|---------|----------| | `scripts/inbox_snapshot.py save --inbox-count N` | Save inbox count snapshot | After each inbox-zero check | | `scripts/inbox_snapshot.py trend --days 7` | Compute inbox trend | During analytics mode | | `scripts/inbox_snapshot.py baseline --days 30` | Compute rolling baseline for bankruptcy | During inbox-zero | | `scripts/inbox_snapshot.py cache load` | Load session cache (1h TTL) | Phase 0 step 0a | | `scripts/inbox_snapshot.py cache save --labels --filters --inbox-count --unread-count --tier` | Write session cache | Phase 0 step 0d | | `scripts/inbox_snapshot.py cache clear` | Invalidate session cache | After write operations | | `scripts/memory.py load --mode MODE` | Load user memory filtered by mode | Phase -1 | | `scripts/memory.py save-sender` | Save VIP or noise sender | After triage/senders | | `scripts/memory.py save-correction` | Record user classification override | When user corrects | | `scripts/memory.py prune` | Remove stale memory entries | Once per session | | `scripts/memory.py stats` | Memory summary counts | Diagnostics | --- ## Critical Rules 1. Run Phase 0 discovery before any workflow — never assume label/filter state (unless `quick` mode) 2. Confirm all write operations — show scope, count, and reversibility before calling 3. Batch similar operations — never loop `gmail_modify_email` when `gmail_batch_modify_emails` applies 4. Preserve existing taxonomy — suggest improvements, never silently rename 5. Check filter conflicts before creating — call `gmail_list_filters` first 6. Include confidence scores on all filter/rule suggestions 7. Provide rollback guidance for every bulk operation 8. Handle MCP failures gracefully — see `efficiency-guide.md` § Error Recovery 9. Surface unsubscribe links — make newsletter cleanup actionable 10. Prioritize quick wins — high impact, low effort first in all reports 11. Update inbox-zero progress after every routine 12. NEVER delete emails unless the user explicitly requests deletion. Archive by default in all modes. When the user does request deletion, use the Destructive Warning template (TYPE "DELETE" confirmation) before calling `gmail_delete_email` or `gmail_batch_delete_emails` 13. Issue ALL independent tool calls in a single message — sequential independent calls is a bug 14. Use fast-lane classification before reading content — sender + subject resolve 60%+ of triage 15. ALWAYS run the security gate on NOISE before archiving — scan subjects for 2FA/password/security keywords 16. Check session cache before Phase 0. Invalidate cache after any write operation (`gmail_create_label`, `gmail_get_or_create_label`, `gmail_create_filter`, `gmail_create_filter_from_template`, `gmail_batch_modify_emails`, `gmail_batch_delete_emails`). 17. maxResults 100-200 (not 500) — MCP N+1 pattern means 500 results ≈ 501 API calls 18. Bankruptcy triggers are relative to 30-day baseline, not absolute thresholds 19. Load user memory before Phase 0 (Phase -1). Save memories AFTER delivering primary output — never block a mode's core workflow for a memory write --- ## Memory Save Triggers Save memories at natural decision points. NEVER slow down a mode to save — always save AFTER the mode's primary output. | Trigger | Command | | ------- | ------- | | Triage: user overrides fast-lane classification | `memory.py save-correction --mode triage --what "..." --correction "..." --pattern "..." --action BUCKET` | | Triage: NOISE batch archived successfully | `memory.py save-sender --email X --type noise --reason "..." --source triage` (batch multiple in 1 message) | | Senders: VIP identified (reply rate >50%) | `memory.py save-sender --email X --type vip --name "..." --reason "..." --source senders` | | Filters: filter created successfully | `memory.py save-filter --filter-id ID --description "..." --effectiveness high` | | Filters: user rejects a suggestion | `memory.py save-filter --filter-id rejected --description "..." --effectiveness failed` | | Labels: user states naming preference | `memory.py save-labels --convention "..." --structure "..."` | | Analytics: volume patterns computed | `memory.py save-patterns --daily-volume N --busy '[...]'` | | Any mode: user corrects a classification | `memory.py save-correction ...` | Batch multiple save calls in a SINGLE message when a mode produces several memories (e.g., triage identifies 5 noise senders)