---
name: multi-model-research
description: Orchestrate multiple frontier LLMs (Claude, GPT-5.1, Gemini 3.0 Pro, Perplexity Sonar, Grok 4.1) for comprehensive research using LLM Council pattern with peer review and synthesis
triggers:
  - "deep dive"
  - "research council"
  - "multi-model research"
  - "comprehensive research"
  - "council research"
allowed-tools: Bash, Read, mcp__obsidian-vault__create_vault_file
version: 0.1.0
---

# Multi-Model Research Agent

Implements Karpathy's LLM Council pattern for superior research through parallel queries, peer review, and chairman synthesis.

## Architecture

**Geoffrey/Claude (Native Council Member):**
- Routes simple vs complex queries
- Calls external API orchestrator (`research.py`)
- Provides my own research response
- Conducts peer review phase
- Requests GPT-5.1 synthesis (chairman)
- Saves final report to Obsidian

**Python External API Orchestrator:**
- Fetches responses from GPT-5.1, Gemini 3.0 Pro, Perplexity Sonar, Grok 4.1
- Returns JSON with all external responses
- I handle all orchestration and synthesis

## When to Use This Skill

Use multi-model research when:
- **Complex analysis needed** - Multiple perspectives valuable
- **Factual verification critical** - Cross-model validation
- **Comprehensive coverage required** - No single model sufficient
- **Current information essential** - Perplexity provides web grounding
- **Contested topics** - Benefit from diverse model perspectives

## Simple vs Council Mode

**Simple Mode** (Perplexity only):
- Factual lookups
- Current events
- Quick research with citations
- Completes in <15 seconds

**Council Mode** (Full council):
- Comparative analysis
- Deep research
- Multiple perspectives needed
- Strategic questions
- Completes in <90 seconds

## Workflow

### Simple Query

```
User: "What are the latest developments in quantum computing?"
  ↓
I decide: Simple query (factual, current)
  ↓
I call: uv run scripts/research.py --query "..." --models perplexity
  ↓
I read: JSON response from Perplexity
  ↓
I format: Markdown report with citations
  ↓
I save: To Obsidian Geoffrey/Research folder
  ↓
I return: Summary to user with Obsidian link
```

### Council Query

```
User: "Compare the AI strategies of OpenAI, Anthropic, and Google"
  ↓
I decide: Council query (comparative, complex)
  ↓
I call: uv run scripts/research.py --query "..." --models gpt,gemini,perplexity,grok
  ↓
I read: JSON with all external responses
  ↓
I provide: My own (Claude) research response
  ↓
I conduct: Peer review (each model ranks others)
  ↓
I request: GPT-5.1 chairman synthesis
  ↓
I format: Comprehensive markdown report
  ↓
I save: To Obsidian Geoffrey/Research folder
  ↓
I return: Summary with Obsidian link
```

## Output Format

All research reports saved to Obsidian include:
- **Executive Summary** (2-3 paragraphs)
- **Key Findings** (organized by theme, inline citations)
- **Confidence Assessment** (what's certain vs debated)
- **References Section** (all sources with URLs and dates)

Citations use numeric format: [1], [2], etc.

## Technical Details

**Python Script:**
```bash
cd skills/multi-model-research
uv run scripts/research.py --query "Your question" --models perplexity --output /tmp/responses.json
```

**Config:**
- `config.yaml` - Model settings, routing rules
- `prompts/system_prompts.yaml` - Per-model system prompts
- `prompts/peer_review.md` - Peer review template
- `prompts/chairman_synthesis.md` - GPT-5.1 synthesis template

**Dependencies:**
- httpx (async HTTP client)
- pyyaml (config parsing)
- python-dotenv (env vars)
- python-frontmatter (Obsidian frontmatter)

**API Keys Required:**
- OPENAI_API_KEY (GPT-5.1)
- GEMINI_API_KEY (Gemini 3.0 Pro)
- PERPLEXITY_API_KEY (Sonar Pro)
- XAI_API_KEY (Grok 4.1)

All keys configured in `~/.env` file.

## Examples

**Simple Research:**
```
User: "What is RAG in AI?"
I route to: Simple mode (Perplexity)
Output: Concise explanation with current examples and citations
Time: ~10 seconds
```

**Council Research:**
```
User: "Compare serverless vs containers for production ML workloads"
I route to: Council mode (all 4 external + me)
Process:
  1. GPT-5.1: Provides comprehensive technical comparison
  2. Gemini 3.0: Analyzes cost and performance trade-offs
  3. Perplexity: Current industry trends and case studies
  4. Grok 4.1: Developer sentiment from X/Twitter
  5. Claude (me): Synthesize with nuanced analysis
  6. Peer review: Each model ranks others
  7. GPT-5.1 (chairman): Final synthesis
Output: Multi-perspective analysis with citations
Time: ~60 seconds
```

## Limitations

- **Cost**: Council mode uses 4-5 API calls per query
- **Latency**: Council mode takes 60-90 seconds
- **API Limits**: Rate limits may throttle parallel requests
- **Citation Quality**: Non-Perplexity models require URL extraction

## Future Enhancements

- Streaming responses during deliberation
- Cost tracking and budget limits
- Query history and versioning
- Custom model weights based on topic
- Integration with Geoffrey's knowledge base

---

*This skill implements Karpathy's LLM Council pattern released November 22, 2025.*
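
**Sketch: parallel fan-out (illustrative only)**

The parallel fetch described under Architecture, together with the rate-limit caveat under Limitations, could be sketched roughly as follows. This is not the contents of `scripts/research.py`: the names `query_model`, `fan_out`, and `run_council`, and the shape of the returned dictionary, are assumptions made for illustration, and the stub returns canned text instead of calling any API. The idea is that one failing provider should be recorded without discarding the other responses.

```python
import asyncio
import json
from typing import Awaitable, Callable, Sequence

# Model keys mirror the --models values used in the workflow examples.
COUNCIL_MODELS = ("gpt", "gemini", "perplexity", "grok")

# Hypothetical signature for a per-model call: (model, query) -> response text.
ModelCall = Callable[[str, str], Awaitable[str]]


async def query_model(model: str, query: str) -> str:
    """Stand-in for a real API request (assumption: no network access here)."""
    await asyncio.sleep(0)  # yield to the event loop, as an HTTP call would
    return f"[{model}] response to: {query}"


async def fan_out(
    query: str,
    models: Sequence[str],
    call: ModelCall = query_model,
) -> dict:
    """Query every model concurrently and keep failures next to successes."""
    results = await asyncio.gather(
        *(call(model, query) for model in models),
        return_exceptions=True,  # a failing model must not cancel the rest
    )
    responses: dict = {}
    errors: dict = {}
    for model, result in zip(models, results):
        if isinstance(result, BaseException):
            errors[model] = str(result)
        else:
            responses[model] = result
    return {"query": query, "responses": responses, "errors": errors}


def run_council(query: str, models: Sequence[str] = COUNCIL_MODELS) -> dict:
    """Synchronous entry point, convenient for a CLI wrapper."""
    return asyncio.run(fan_out(query, models))


if __name__ == "__main__":
    print(json.dumps(run_council("Compare serverless vs containers"), indent=2))
```

Collecting errors in their own map, instead of raising on the first failure, lets the peer review and synthesis steps proceed with whichever council members answered.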