--- name: multi-ai-research description: Comprehensive research and analysis using Claude (subagents), Gemini CLI, and Codex CLI. Multi-perspective research with cross-verification, iterative refinement, and 100% citation coverage. Use for security analysis, architecture research, code quality assessment, performance analysis, or any research requiring rigorous verification and multiple AI perspectives. allowed-tools: Task, Bash, Read, Grep, Glob, Write --- # Multi-AI Research & Analysis ## Overview Harnesses three AI systems (Claude via Task tool, Gemini CLI, Codex CLI) for comprehensive research and analysis with multi-perspective verification and iterative refinement. **Purpose**: Produce analysis more thorough than any single AI could achieve through specialized roles, cross-validation, and systematic verification. **Key Innovation**: Not just parallel execution - specialized research roles with cross-verification and iterative refinement until production-ready (quality ≥95/100, 100% citations, zero gaps). **The 3 AI Systems**: 1. **Claude Subagents** (via Task tool) - Documentation, codebase analysis, synthesis 2. **Gemini CLI** - Web research, latest trends, community practices 3. **Codex CLI** - GitHub patterns, code examples, deep reasoning **Quality Guarantees**: - ✓ **100% coverage** - All objectives addressed, zero gaps - ✓ **100% citations** - Every claim sourced (file:line or URL) - ✓ **Multi-perspective** - 3 AI systems cross-validated - ✓ **≥95/100 quality** - Verified through 3-pass system - ✓ **Actionable** - Specific recommendations with examples - ✓ **Resumable** - External memory enables multi-session work --- ## When to Use Use this skill for: ### Security Analysis - Authentication/authorization assessment - Vulnerability identification - Best practice validation - OWASP Top 10 coverage - Penetration testing preparation ### Architecture Analysis - System design review - Component mapping - Integration pattern analysis - Scalability assessment - Technical debt evaluation ### Code Quality Analysis - Pattern detection - Code smell identification - Complexity metrics - Refactoring opportunities - Best practice adherence ### Performance Analysis - Bottleneck identification - Algorithm complexity - Resource usage patterns - Optimization opportunities - Benchmark analysis ### Research Synthesis - Multi-source research compilation - Best practice identification - Technology evaluation - Pattern discovery - Trend analysis ### Comprehensive Reviews - Pre-production audit - System health check - Compliance verification - Documentation audit - Knowledge transfer --- ## Quick Start ### Option 1: Automated Script ```bash # Run complete analysis automatically bash .claude/skills/multi-ai-research/scripts/analyze.sh "Security analysis of authentication system" ``` This will: 1. Create analysis plan 2. Launch parallel research (Claude + Gemini + Codex) 3. Perform deep analysis 4. Synthesize and verify 5. Iterate if needed 6. Generate final report ### Option 2: Interactive Mode Ask Claude Code to use this skill: ``` "Use multi-ai-research to analyze [objective]" ``` Claude will: 1. Create comprehensive analysis plan 2. Coordinate all three AI systems 3. Synthesize findings 4. Verify quality 5. Iterate until ≥95 quality 6. Deliver final report --- ## The 5-Phase Pipeline ### Phase 1: Planning & Strategy **Duration**: 5-10 minutes **Output**: `.analysis/ANALYSIS_PLAN.md` Claude creates comprehensive plan: - Defines objectives and scope - Plans file reading strategy (glob → grep → read) - Assigns tasks to AI systems - Sets verification criteria - Defines success thresholds ### Phase 2: Parallel Research **Duration**: 10-20 minutes **Output**: `.analysis/research/*.md` All three systems research simultaneously: **Claude Subagent**: - Official documentation analysis - Codebase examination (progressive disclosure) - Architecture mapping - Pattern identification **Gemini CLI**: - Web research (latest 2024-2025) - Community best practices - Industry trends - Common pitfalls **Codex CLI**: - GitHub pattern analysis - Code examples from top repos - Implementation references - Testing strategies ### Phase 3: Deep Analysis **Duration**: 15-30 minutes **Output**: `.analysis/analysis/code-patterns.md` Claude Analysis Agent with extended thinking: - Progressive codebase analysis - Pattern recognition across sources - Architecture mapping - Metrics calculation - Risk assessment ### Phase 4: Synthesis & Verification **Duration**: 10-20 minutes **Outputs**: - `.analysis/SYNTHESIS_REPORT.md` - `.analysis/verification/cross-check.md` **Synthesis** (Claude with extended thinking): - Read all research findings - Identify themes across sources - Resolve contradictions - Create unified narrative - Full citations **Verification** (Verification Subagent): - 3-pass verification (completeness, accuracy, quality) - Cross-source validation - Citation checking - Gap analysis - Quality scoring ### Phase 5: Iteration (if needed) **Duration**: 10-30 minutes **Output**: `.analysis/iterations/ITERATION_2.md` If quality <95 or gaps exist: - Targeted research for gaps - Quality improvements - Re-verification - Repeat until ≥95 ### Phase 6: Final Report **Duration**: 5-10 minutes **Output**: `.analysis/ANALYSIS_FINAL.md` Comprehensive final report: - Executive summary - Complete findings - All sources synthesized - Prioritized recommendations - Implementation guidance - Full citations **Total Time**: 45-90 minutes for comprehensive analysis --- ## Analysis Types ### Security Analysis **What it checks**: - Authentication/authorization patterns - Input validation - Secret management - Injection vulnerabilities (SQL, XSS, etc.) - Dependency vulnerabilities - Rate limiting - Session security **Example**: ``` Use multi-ai-research for "Security audit of authentication system" ``` **Output**: - Critical/High/Medium/Low priority issues - OWASP Top 10 coverage - Code examples with file:line - Specific remediation steps - Industry best practices comparison ### Architecture Analysis **What it examines**: - System components and boundaries - Integration patterns - Data flow - Dependency relationships - Scalability considerations - Design patterns used **Example**: ``` Use multi-ai-research for "Architecture analysis of microservices system" ``` **Output**: - Component map with relationships - Integration pattern analysis - Scalability assessment - Technical debt identification - Refactoring recommendations ### Code Quality Analysis **What it analyzes**: - Code patterns and organization - Complexity metrics - Code smells - Best practice adherence - Test coverage - Documentation quality **Example**: ``` Use multi-ai-research for "Code quality assessment for ./src" ``` **Output**: - Quality score with breakdown - Pattern analysis - Refactoring priorities - Specific code improvements - Complexity hotspots ### Performance Analysis **What it identifies**: - Algorithm complexity - Bottlenecks - Resource usage patterns - Database query efficiency - Network call patterns **Example**: ``` Use multi-ai-research for "Performance bottleneck identification" ``` **Output**: - Bottleneck analysis with file:line - Optimization opportunities - Before/after estimations - Implementation guidance ### Research Synthesis **What it compiles**: - Official documentation - Web best practices - GitHub patterns - Industry standards - Community insights **Example**: ``` Use multi-ai-research for "Research GraphQL federation patterns 2024-2025" ``` **Output**: - Multi-source synthesis - Consensus findings (all sources agree) - Multiple perspectives (sources differ) - Code examples - Implementation recommendations --- ## How It Works ### Progressive Disclosure **Never reads files blindly**. Always uses 3-level approach: **Level 1: Metadata (glob)** - ~50 tokens ```bash glob "**/*.{ts,js,py}" # Understand structure glob "**/*.md" # Find documentation glob "**/package.json" # Check dependencies ``` **Level 2: Patterns (grep)** - ~5k tokens ```bash grep "export class|interface" --glob "**/*.ts" grep "TODO|FIXME|BUG" --glob "**/*" grep "password|secret|token" --glob "**/*.ts" ``` **Level 3: Reading (read)** - ~50k tokens ```bash read "src/auth/login.ts" # Only critical files read "docs/architecture.md" ``` **Result**: 90%+ reduction in unnecessary file reads ### External Memory Architecture All state saved to files, not context: ``` .analysis/ ├── ANALYSIS_PLAN.md # Strategy and assignments ├── research/ │ ├── claude-docs.md # Claude research │ ├── gemini-web.md # Gemini research │ └── codex-github.md # Codex research ├── analysis/ │ ├── code-patterns.md # Pattern analysis │ └── architecture-map.md # System map ├── verification/ │ └── cross-check.md # Verification results ├── iterations/ │ ├── ITERATION_1.md # First pass │ └── ITERATION_2.md # Gap fills └── ANALYSIS_FINAL.md # Complete report ``` **Benefits**: - Survives context window limits - Enables multi-session analysis - Resumable from any checkpoint - No information loss ### Cross-Validation Pattern **High Confidence** (★★★★★): All 3 sources agree + code verification **Medium Confidence** (★★★☆☆): 2/3 sources agree **Requires Investigation** (★★☆☆☆): Sources conflict **Example**: ```markdown ## JWT Implementation (High Confidence ★★★★★) **Claude**: "Uses JWT with HS256" (src/auth/jwt.ts:15) **Gemini**: "HS256 is industry standard 2024" (URL) **Codex**: "150+ repos use HS256 pattern" (GitHub) **Code**: Verified at src/auth/jwt.ts:18-22 **Recommendation**: Implementation correct per standards ``` ### Quality Scoring **Comprehensive rubric** (0-100): - **Comprehensiveness** (/20): All aspects covered - **Accuracy** (/20): All claims sourced and verified - **Specificity** (/20): File:line precision, not vague - **Actionability** (/20): Specific recommendations - **Consistency** (/20): No contradictions **Quality Gates**: - ≥95: Production-ready - 85-94: Needs minor refinement - 75-84: Needs iteration - <75: Requires rework ### Iterative Refinement **Iteration 1** (Breadth): Broad coverage, identifies gaps **Iteration 2** (Depth): Fill gaps, improve quality **Iteration 3** (Polish): Final verification, perfection **Automatic iteration** until: - Quality ≥95 - Citation coverage = 100% - Critical gaps = 0 --- ## AI System Roles ### Claude Subagents (via Task tool) **Research Agent** (Haiku): - Progressive disclosure expert - Documentation analysis - Codebase examination - Pattern detection **Analysis Agent** (Sonnet): - Extended thinking for synthesis - Multi-source integration - Pattern recognition - Architectural insights **Verification Agent** (Haiku): - 3-pass verification - Citation checking - Gap analysis - Quality scoring ### Gemini CLI **Strengths**: - Native web search - Latest trends (2024-2025) - Community practices - Multimodal analysis (if needed) **Use for**: - Best practice research - Industry standards - Latest vulnerabilities - Framework comparisons ### Codex CLI **Strengths**: - GitHub integration - Code pattern search - Deep reasoning (o3 model) - Implementation examples **Use for**: - Code examples - Design patterns - Architecture reasoning - Testing strategies --- ## Configuration ### Prerequisites **Required**: - Claude Code (with Task tool access) **Optional but Recommended**: - Gemini CLI: `npm install -g @google/gemini-cli` - Codex CLI: `npm install -g @openai/codex` **Note**: Skill works with Claude-only fallback if Gemini/Codex unavailable. ### Gemini CLI Setup ```bash # Install npm install -g @google/gemini-cli # Authenticate (OAuth - free) gemini # Follow browser authentication # Test gemini -p "test prompt" ``` ### Codex CLI Setup ```bash # Install npm install -g @openai/codex # Authenticate (ChatGPT Plus/Pro account) codex login # Follow browser authentication # Test codex exec "test prompt" ``` ### Model Selection **Claude**: - Haiku: Research & verification (fast, efficient) - Sonnet: Analysis & synthesis (balanced) - Opus: Complex reasoning (if needed) **Gemini**: - gemini-2.5-flash: Quick research - gemini-2.5-pro: Complex analysis **Codex**: - gpt-5.1-codex: Standard tasks - o3: Deep architectural reasoning - o4-mini: Quick operations --- ## Examples ### Example 1: Security Analysis ``` Objective: "Security audit of authentication system" Phase 2 - Parallel Research: ├─ Claude: Analyzes src/auth/* for patterns ├─ Gemini: Researches "OAuth 2.0 security best practices 2024" └─ Codex: Finds GitHub examples of secure auth Phase 3 - Analysis: └─ Claude: Identifies 3 critical, 5 high priority issues Phase 4 - Synthesis: └─ All agree: Missing rate limiting (CRITICAL) - Claude: No rate limit found in src/auth/login.ts - Gemini: OWASP recommends max 5 attempts/hour - Codex: 150+ repos use express-rate-limit - Recommendation: Implement with Redis backend Final Report: ├─ Executive summary ├─ 8 issues (3 critical, 5 high) with fixes ├─ OWASP Top 10 coverage ├─ Specific code examples └─ Priority implementation plan Quality: 97/100 ✓ ``` ### Example 2: Architecture Analysis ``` Objective: "Analyze microservices architecture" Phase 2: ├─ Claude: Maps services via glob + grep ├─ Gemini: Researches microservices patterns 2024 └─ Codex: Finds service mesh examples Phase 3: └─ Claude: Identifies 7 services, 12 integration points Phase 4: └─ Synthesis: Service communication patterns - Consensus: REST for external, gRPC for internal - Trade-offs documented - Scaling strategies from Codex examples Final Report: ├─ Component map (7 services, dependencies) ├─ Integration analysis (12 patterns) ├─ Scalability assessment └─ Modernization recommendations Quality: 96/100 ✓ ``` ### Example 3: Research Synthesis ``` Objective: "Research state management patterns for React 2024" Phase 2: ├─ Claude: Reviews React docs + examples ├─ Gemini: Web research "React state management 2024" └─ Codex: Analyzes top 50 React repos Phase 3: └─ Pattern analysis: 5 major approaches identified Phase 4: └─ Synthesis by use case: - Small apps: Context (all sources agree) - Medium apps: Zustand (Gemini + Codex recommend) - Large apps: Redux Toolkit (battle-tested, Codex data) - Server state: TanStack Query (trending, Gemini research) Final Report: ├─ Decision tree by project size ├─ Pros/cons with sources ├─ Migration strategies └─ Code examples from Codex Quality: 98/100 ✓ ``` --- ## Best Practices ### 1. Be Specific with Objectives ``` ❌ "Analyze the code" ✅ "Security analysis of authentication module for OWASP Top 10 compliance" ``` ### 2. Trust the Verification Multi-pass verification catches issues. If quality <95, iteration happens automatically. ### 3. Review External Memory Check `.analysis/` folder during execution to see progress. ### 4. Leverage Citations Every claim has file:line or URL. Use for validation and deep dives. ### 5. Multi-Session Projects Large projects can span sessions: ``` Session 1: Initial analysis → ITERATION_1.md Session 2: Gap filling → ITERATION_2.md Session 3: Final polish → ANALYSIS_FINAL.md ``` ### 6. Check All Three Perspectives High-value insights often come from comparing AI perspectives. --- ## Troubleshooting ### Low Quality Score (<95) **Cause**: Gaps in coverage or missing citations **Solution**: Automatic iteration 2 fills gaps **Check**: `.analysis/verification/cross-check.md` for details ### Missing Citations **Cause**: Verification flags uncited claims **Solution**: Iteration adds missing attributions **Prevention**: All agents trained to cite sources ### Gemini/Codex Unavailable **Fallback**: Claude-only analysis with warning **Impact**: Reduced perspectives but still comprehensive **Install**: `npm install -g @google/gemini-cli @openai/codex` ### Conflicting Information **Resolution**: Synthesis phase investigates conflicts **Method**: Check ground truth (actual code/docs) **Output**: Documented reasoning for resolution --- ## Related Skills - `anthropic-expert`: Anthropic product expertise - `codex-cli`: Codex integration patterns - `gemini-cli`: Gemini integration patterns - `tri-ai-collaboration`: General tri-AI workflows - `analysis`: Code/skill/process analysis --- ## Quick Reference ### Command Line ```bash # Full automated analysis bash .claude/skills/multi-ai-research/scripts/analyze.sh "objective" # Interactive with Claude Code # Just ask: "Use multi-ai-research for [objective]" ``` ### File Locations | File | Purpose | |------|---------| | `.analysis/ANALYSIS_PLAN.md` | Strategy and assignments | | `.analysis/research/` | All AI research outputs | | `.analysis/SYNTHESIS_REPORT.md` | Multi-source synthesis | | `.analysis/ANALYSIS_FINAL.md` | Complete final report | ### Quality Metrics | Metric | Threshold | Meaning | |--------|-----------|---------| | Quality Score | ≥95/100 | Production-ready | | Citation Coverage | 100% | All claims sourced | | Completeness | ≥95% | All objectives met | | Critical Gaps | 0 | No missing essentials | ### Analysis Time Estimates | Type | Time | Iterations | |------|------|------------| | Security | 45-60 min | 1-2 | | Architecture | 60-90 min | 1-2 | | Code Quality | 30-45 min | 1 | | Performance | 45-60 min | 1-2 | | Research | 30-60 min | 1 | --- **multi-ai-research delivers production-ready analysis through systematic multi-AI collaboration, rigorous verification, and iterative refinement - ensuring nothing is missed and every claim is verified.**