---
name: delegation
description: "Unified provider selection for subagent delegation. Quick decision matrix for choosing between Kimi K2.5, GLM, and MiniMax based on task type."
---

# Unified Delegation Skill

## ⛔ CRITICAL: No Claude Subagents

**NEVER spawn Claude models (Haiku, Sonnet, Opus) as subagents.**

Enforced in `.claude/settings.local.json` deny rules.

---

## Provider Selection Matrix

| Task Type | Best Provider | Why | Fallback |
|-----------|--------------|-----|----------|
| Complex reasoning | **Kimi K2.5** | Most intelligent, 256K context | GLM-4.7 |
| Image/vision (batch) | **Kimi K2.5** | Built-in vision capability | GLM-4.6v |
| Creative/brainstorming | **GLM-4.7** | Strong creative problem-solving | Kimi |
| Web research | **MiniMax** | Fast, reliable, cheap | GLM |
| Simple file exploration | **MiniMax** | Quick turnaround | any |
| Batch operations | **GLM** | Good parallelism | MiniMax |
| Code review | **MiniMax** | Fast blind-spot check | Kimi |

---

## Quick Decision Flow

```
┌─ Is it reasoning/decisions? ──────────────────┐
│  YES → Claude does it directly                │
│  NO  → Delegate to subagent ↓                 │
├───────────────────────────────────────────────┤
│                                               │
│  ┌─ Does it need vision? ───────────────────┐ │
│  │  YES → Kimi K2.5 (or GLM-4.6v fallback)  │ │
│  │  NO  ↓                                   │ │
│  └──────────────────────────────────────────┘ │
│                                               │
│  ┌─ Is it complex/creative? ────────────────┐ │
│  │  Complex → Kimi K2.5                     │ │
│  │  Creative → GLM-4.7                      │ │
│  │  Simple → MiniMax                        │ │
│  └──────────────────────────────────────────┘ │
└───────────────────────────────────────────────┘
```

---

## Provider Profiles

### Kimi K2.5 (Most Capable)

**Context:** 256K tokens | **Vision:** Yes | **Thinking mode:** Yes

**Best for:**
- Complex multi-step reasoning
- Batch image analysis (10+ images)
- Tasks requiring deep understanding
- Fallback for failed GLM tasks

**Launcher:** `.\scripts\start-kimi.ps1`

**API Config:**
```
Base URL: https://api.moonshot.cn/anthropic/
Models: kimi-k2.5-thinking, kimi-k2-turbo-preview
```

### GLM-4.7 (Creative)

**Context:** 128K tokens | **Vision:** GLM-4.6v variant | **Thinking mode:** Yes

**Best for:**
- Creative brainstorming
- Mathematical reasoning (95.7% AIME 2025)
- Parallel batch tasks
- Tool use orchestration

**MCP:** `.cursor/mcp.json` (GLM-4.6v configured)

### MiniMax M2.1 (Fast & Cheap)

**Context:** 128K tokens | **Vision:** VLM API | **Speed:** Fastest

**Best for:**
- Quick web searches
- Simple file exploration
- Structured data extraction
- Code review for blind spots

**Launcher:** `.\scripts\start-claude-minimax.ps1`

**MCP:** `.cursor/mcp.json` (MiniMax configured)

---

## Delegation Patterns

### Pattern 1: Research → Claude Decides

```
1. Claude receives task requiring research
2. Claude spawns MiniMax: "Find all uses of X in codebase"
3. MiniMax returns findings
4. Claude reasons and implements
```

### Pattern 2: Batch Vision Analysis

```
1. Claude needs to analyze 20 sprites
2. Claude spawns Kimi K2.5: "Analyze quality of each sprite"
3. Kimi returns analysis for all 20
4. Claude makes decisions based on report
```

### Pattern 3: Creative Exploration

```
1. Claude needs alternative approaches
2. Claude spawns GLM-4.7: "Brainstorm 5 solutions for X"
3. GLM returns creative options
4. Claude selects and refines best approach
```

### Pattern 4: Code Review

```
1. Claude writes code
2. Claude spawns MiniMax: "Check for bugs, edge cases, security issues"
3. MiniMax returns concerns
4. Claude addresses or dismisses with reasoning
```

---

## Parallel Delegation

Launch multiple subagents in a single message:

```
Task(prompt="Research X", subagent_type="general-purpose")  ←─┐
Task(prompt="Research Y", subagent_type="general-purpose")  ←─┼─ Parallel
Task(prompt="Research Z", subagent_type="general-purpose")  ←─┘
```

**Rules:**
- Independent tasks → parallel
- Dependent tasks → sequential
- Never chain Claude subagents

---

## Background Execution (Token Suspension)

**Problem:** Claude tokens burn while waiting for subagent results.

**Solution:** Use `run_in_background=true` + end turn early.

### Pattern: Fire-and-Retrieve

```
1. Claude receives task requiring research
2. Task(prompt="...", run_in_background=true) → returns output_file
3. Claude ends turn: "Research agent dispatched. Say 'continue' for results."
4. User says "continue"
5. TaskOutput(task_id="...", block=true) → retrieves results
6. Claude synthesizes and responds
```

### When to Use Background Execution

| Scenario | Background? | Why |
|----------|-------------|-----|
| Research >30 sec | ✅ Yes | Saves expensive Claude wait time |
| Batch image analysis | ✅ Yes | Long-running, user can wait |
| Quick file lookup | ❌ No | Faster to wait inline |
| Claude needs result to continue | ❌ No | Would block anyway |

### Token Savings Calculation

```
Blocking:    Claude waits 60s = 60s of Opus tokens burned
Background:  Claude ends turn = 0s of Opus tokens burned
             (subagent tokens are 50x cheaper)
```

### Example Usage

```
# Fire (spawn and end turn immediately)
Task(
  prompt="Analyze all 20 sprites in assets/sprites/",
  subagent_type="general-purpose",
  run_in_background=true
)
→ Returns: {task_id: "abc123", output_file: "/path/to/output"}

# ... Claude ends turn, tells user to say "continue" ...

# Retrieve (on next turn)
TaskOutput(task_id="abc123", block=true)
→ Returns: Full subagent analysis
```

---

## Token Economics

| Provider | Relative Cost | When to Use |
|----------|---------------|-------------|
| Claude Opus | 50x | Final decisions, complex reasoning |
| Claude Sonnet | 10x | Medium reasoning (avoid as subagent) |
| Kimi K2.5 | 1x | Complex tasks, vision |
| GLM-4.7 | 1x | Creative, batch |
| MiniMax | 1x | Fast, simple |

**Key insight:** 1 hour Claude exploration = 50 hours subagent exploration (cost).

---

## Common Mistakes

| Mistake | Impact | Fix |
|---------|--------|-----|
| Claude spawning Haiku | Expensive | Use MiniMax instead |
| Sequential when parallel possible | Slow | Single message, multiple Tasks |
| Kimi for simple lookup | Overkill | Use MiniMax |
| MiniMax for complex reasoning | Poor quality | Use Kimi K2.5 |
| Claude reading 10+ files | Context bloat | Delegate exploration |

---

## Integration with Other Skills

- **`/skill kimi-k2.5`** - Detailed Kimi setup and patterns
- **`/skill minimax-mcp`** - MiniMax MCP integration details
- **`/skill token-efficient-delegation`** - Full token economics
- **`/skill subagent-best-practices`** - General subagent patterns

---

[Opus 4.5 - 2026-01-29]
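
The Provider Selection Matrix and the Quick Decision Flow in this skill can also be read as a small lookup plus a three-step branch. The following is a minimal Python sketch of that reading, not part of the skill's tooling: the names `Choice`, `MATRIX`, `choose_provider`, and `decision_flow`, and the dictionary keys such as `"web_research"`, are hypothetical labels introduced here for illustration. Only the provider and fallback values come from the tables above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    """One row of the matrix: preferred provider plus its fallback."""
    provider: str
    fallback: str


# Values copied from the Provider Selection Matrix.
# The keys are hypothetical identifiers, not names the skill defines.
MATRIX = {
    "complex_reasoning": Choice("Kimi K2.5", "GLM-4.7"),
    "vision_batch": Choice("Kimi K2.5", "GLM-4.6v"),
    "creative": Choice("GLM-4.7", "Kimi"),
    "web_research": Choice("MiniMax", "GLM"),
    "file_exploration": Choice("MiniMax", "any"),
    "batch_operations": Choice("GLM", "MiniMax"),
    "code_review": Choice("MiniMax", "Kimi"),
}


def choose_provider(task_type: str) -> Choice:
    """Look up the matrix row for a task type; reject unknown types."""
    try:
        return MATRIX[task_type]
    except KeyError:
        raise ValueError(f"unknown task type: {task_type!r}") from None


def decision_flow(is_decision: bool, needs_vision: bool, kind: str) -> str:
    """Follow the Quick Decision Flow from top to bottom.

    kind is one of "complex", "creative", or "simple" (assumed labels).
    """
    if is_decision:
        # Reasoning and decisions stay with Claude; nothing is delegated.
        return "Claude (direct)"
    if needs_vision:
        # GLM-4.6v is the documented fallback for vision work.
        return "Kimi K2.5"
    by_kind = {"complex": "Kimi K2.5", "creative": "GLM-4.7", "simple": "MiniMax"}
    if kind not in by_kind:
        raise ValueError(f"unknown kind: {kind!r}")
    return by_kind[kind]


if __name__ == "__main__":
    row = choose_provider("code_review")
    print(f"code review: {row.provider} (fallback: {row.fallback})")
    print("vision task:", decision_flow(False, True, "simple"))
```

Keeping the matrix as data rather than as nested conditionals means a changed recommendation is a one-line edit, while the flow function keeps the ordering of the checks (decision first, then vision, then complexity) explicit.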