---
name: model-council
description: This skill should be used when the user asks for "model council", "multi-model", "compare models", "ask multiple AIs", "consensus across models", "run on different models", or wants to get solutions from multiple AI providers (Claude, GPT, Gemini, Grok) and compare results. Orchestrates parallel execution across AI models/CLIs and synthesizes the best answer.
---

# Model Council: Multi-Model Consensus

Run the same problem through multiple AI models in parallel, collect their **analysis only**, then Claude Code synthesizes and decides the best approach.

Unlike code-council (which uses one model with multiple approaches), model-council leverages **different model architectures** for true ensemble diversity.

## Critical: Analysis Only Mode

**IMPORTANT**: External models provide analysis and recommendations ONLY. They do NOT make code changes.

- External models: Analyze, suggest, reason, compare options
- Claude Code: Synthesizes all inputs, makes final decision, implements changes

This ensures:
1. Claude Code remains in control of the codebase
2. No conflicting changes from multiple sources
3. Best ideas from all models, unified execution

## Why Multi-Model?

Different models have different:
- Training data and knowledge cutoffs
- Reasoning patterns and biases
- Strengths (math, code, creativity, etc.)

When multiple independent models agree → High confidence the answer is correct.

## Execution Modes

### Mode 1: CLI Agents (Uses Your Existing Accounts)

Call CLI tools that use your logged-in accounts - leverages existing subscriptions!

| CLI Tool | Model | Status |
|----------|-------|--------|
| `claude` | Claude (this session) | ✅ Already running |
| `codex` | OpenAI Codex | Requires setup |
| `gemini` | Google Gemini | Requires setup |
| `aider` | Multi-model | Requires setup |

#### CLI Setup Instructions

**OpenAI Codex CLI:**
```bash
# Install via npm
npm install -g @openai/codex

# Login (uses browser auth)
codex auth

# Verify
codex --version
```

**Google Gemini CLI:**
```bash
# Install via npm  
npm install -g @anthropic-ai/gemini-cli

# Or use gcloud with Vertex AI
gcloud auth application-default login

# Verify
gemini --version
```

**Aider (Multi-model, Recommended):**
```bash
# Install via pip
pip install aider-chat

# Configure with your API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

# Run with specific model
aider --model gpt-4o --message "analyze this code"
```

**Check What's Installed:**
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/model-council/scripts/detect_clis.py
```

### Mode 2: API Calls (Pay per token)

Direct API calls - more reliable, works without CLI setup, but costs money.

Required environment variables:
- `ANTHROPIC_API_KEY` - For Claude API (https://console.anthropic.com/)
- `OPENAI_API_KEY` - For GPT-4 API (https://platform.openai.com/api-keys)
- `GOOGLE_API_KEY` - For Gemini API (https://aistudio.google.com/apikey)
- `XAI_API_KEY` - For Grok API (https://console.x.ai/)

## Configuration

### User Model Selection

Users can specify models inline:

```
model council with claude, gpt-4o, gemini: solve this problem

model council (claude + codex): fix this bug

model council all: use all available models
```

### Default Models

If not specified, use all available:
1. Check which CLI tools are installed
2. Check which API keys are set
3. Use what's available

### Config File (Optional)

Users can create `~/.model-council.yaml`:

```yaml
# Preferred models (in order)
models:
  - claude      # Use Claude Code CLI (current session)
  - codex       # Use Codex CLI if installed
  - gemini-cli  # Use Gemini CLI if installed
  
# Fallback to APIs if CLIs not available
fallback_to_api: true

# API models to use when falling back
api_models:
  anthropic: claude-sonnet-4-20250514
  openai: gpt-4o
  google: gemini-2.0-flash
  xai: grok-3
  
# Timeout per model (seconds)
timeout: 120

# Run in parallel or sequential
parallel: true
```

## Workflow

### Step 1: Parse Model Selection

Determine which models to use:
1. Check user's inline specification (e.g., "with claude, gpt-4o")
2. If none specified, check config file
3. If no config, detect available CLIs and APIs

### Step 2: Prepare the Prompt

Format the problem for each model with **analysis-only instructions**:

```
Analyze the following problem and provide your recommendations.
DO NOT output code changes directly.
Instead, provide:
1. Your analysis of the problem
2. Recommended approach(es)
3. Potential issues or edge cases to consider
4. Trade-offs between different solutions

Problem:
[user's problem here]
```

Key rules:
- Keep the core problem identical across models
- Explicitly request analysis, not implementation
- Include relevant context (code snippets, error messages)

### Step 3: Execute in Parallel

Use the API council script to query multiple models:

```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/model-council/scripts/api_council.py \
  --prompt "Analyze this problem and recommend solutions (do not implement): [problem]" \
  --models "claude-sonnet,gpt-4o,gemini-flash"
```

Available models:
- `claude-sonnet`, `claude-opus` - Anthropic
- `gpt-4o`, `gpt-4-turbo`, `o1` - OpenAI
- `gemini-flash`, `gemini-pro` - Google
- `grok` - xAI

List all models:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/model-council/scripts/api_council.py --list-models
```

### Step 4: Collect Responses

Gather all model responses with metadata:
- Model name and version
- Response time
- Token usage (if available)
- Full response

### Step 5: Analyze Consensus

Compare responses looking for:
- **Agreement**: Do models produce the same answer/approach?
- **Unique insights**: Does one model catch something others missed?
- **Disagreements**: Where do models differ and why?

### Step 6: Claude Code Synthesizes and Decides

Claude Code (this session) uses ultrathink to:
1. Evaluate each model's analysis
2. Identify the strongest reasoning and recommendations
3. Note where models agree (high confidence) vs disagree (investigate further)
4. Make the final decision on approach
5. **Implement the solution** - only Claude Code makes code changes

This is the key difference from just asking one model:
- Multiple perspectives inform the decision
- Claude Code remains the single source of truth for implementation
- No conflicting changes from different models

### Step 7: Deliver Results

Provide:
1. **Final synthesized answer** (best combined solution)
2. **Consensus score** (how many models agreed)
3. **Individual responses** (for transparency)
4. **Insights** (what each model contributed)

## CLI Detection

To check available CLIs:
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/model-council/scripts/detect_clis.py
```

This checks for:
- `claude` - Claude Code CLI
- `codex` - OpenAI Codex CLI
- `gemini` - Gemini CLI
- `aider` - Aider (multi-model)
- `cursor` - Cursor AI (if applicable)

## Comparison: code-council vs model-council

| Aspect | code-council | model-council |
|--------|--------------|---------------|
| Models used | Claude only | Multiple (Claude, GPT, Gemini, etc.) |
| Diversity source | Different approaches | Different architectures |
| Cost | Free (uses current session) | Free (CLIs) or paid (APIs) |
| Speed | Fast (single model) | Slower (parallel calls) |
| Best for | Quick iterations | High-stakes decisions |

## When to Use Each

**Use code-council when:**
- You want fast iterations
- The problem is well-defined
- You trust Claude's reasoning

**Use model-council when:**
- High-stakes code (production, security)
- You want architectural diversity
- Models might have different knowledge
- You want to verify Claude's answer

## Error Handling

**CLI not found**: Skip that model, log warning, continue with others.

**API key missing**: Skip that provider, try CLI fallback if available.

**Timeout**: Return partial results, note which models timed out.

**No models available**: Error with setup instructions.

## Example Output

```
## Model Council Analysis Results

### Consensus: HIGH (3/3 models agree on approach)

### Summary of Recommendations:
All models recommend using a hash map for O(1) lookup. 
Key considerations raised:
- Handle null/empty input (Claude, GPT-4o)
- Consider memory vs speed tradeoff (Gemini)
- Add input validation (all models)

### Individual Analyses:

#### Claude Sonnet (API)
Analysis: The bug is caused by off-by-one error in the loop boundary.
Recommendation: Change `i <= len` to `i < len`
Edge cases noted: Empty array, single element
Confidence: High

#### GPT-4o (API)
Analysis: Loop iterates one element past array bounds.
Recommendation: Fix loop condition, add bounds check
Additional insight: Could also use forEach to avoid index errors
Confidence: High

#### Gemini Flash (API)
Analysis: Array index out of bounds on final iteration.
Recommendation: Adjust loop termination condition
Reference: Similar to common off-by-one patterns
Confidence: High

### Claude Code Decision:
Based on consensus, implementing fix with:
- Loop condition change (i < len)
- Added null check for robustness
- Unit test for edge cases

[Claude Code now implements the solution]
```