---
name: mcp-server-enhancement
description: Guide for safely adding new MCP tools to the AI Counsel server
when_to_use: |
  Use this skill when you need to:
  - Add a new MCP tool to server.py
  - Extend the MCP server with additional deliberation capabilities
  - Create new tool handlers with proper stdio safety
  - Ensure new tools follow MCP protocol standards
  - Add tools that interact with existing engine/storage components
tags: [mcp, protocol, server, tools, stdio]
---

# MCP Server Enhancement Skill

This skill provides a systematic approach to extending the AI Counsel MCP server (`server.py`) with new tools while maintaining protocol compliance, stdio safety, and proper error handling.

## Architecture Overview

The AI Counsel MCP server communicates via **stdio** (stdin/stdout) using the Model Context Protocol. Key architectural constraints:

- **Stdio Safety**: stdout is RESERVED for MCP protocol JSON. All logging MUST go to file (`mcp_server.log`) or stderr
- **Protocol Compliance**: Tools must follow MCP specification for request/response format
- **Type Safety**: Use Pydantic models for all request/response validation
- **Error Isolation**: Tool failures should return structured error responses, not crash the server
- **Async First**: All tool handlers are async functions using asyncio

## Current Tool Architecture

### Tool 1: `deliberate` (Primary Tool)
- **Purpose**: Multi-round AI model deliberation with consensus building
- **Handler**: `call_tool()` function (lines 242-327 in server.py)
- **Request Model**: `DeliberateRequest` (models/schema.py)
- **Response Model**: `DeliberationResult` (models/schema.py)
- **Engine**: Uses `DeliberationEngine.execute()` for orchestration

### Tool 2: `query_decisions` (Decision Graph Tool)
- **Purpose**: Search and analyze past deliberations in decision graph memory
- **Handler**: `handle_query_decisions()` function (lines 329-415 in server.py)
- **Request Schema**: Inline in `list_tools()` (lines 196-237)
- **Response**: Custom JSON structure (not a Pydantic model)
- **Conditional**: Only exposed if `config.decision_graph.enabled == True`

## Step-by-Step: Adding a New MCP Tool

### Step 1: Define Pydantic Request/Response Models

**Location**: `models/schema.py`

Create type-safe models for your tool's inputs and outputs:

```python
# In models/schema.py

class NewToolRequest(BaseModel):
    """Model for new_tool request."""

    parameter1: str = Field(
        ...,
        min_length=1,
        description="Description of parameter1"
    )
    parameter2: int = Field(
        default=5,
        ge=1,
        le=10,
        description="Integer parameter with range validation"
    )
    optional_param: Optional[str] = Field(
        default=None,
        description="Optional parameter"
    )

class NewToolResponse(BaseModel):
    """Model for new_tool response."""

    status: Literal["success", "partial", "failed"] = Field(
        ...,
        description="Operation status"
    )
    result_data: str = Field(..., description="Main result data")
    metadata: dict = Field(default_factory=dict, description="Additional metadata")
```

**Best Practices**:
- Use `Field()` with descriptive text for all fields (helps MCP client documentation)
- Use `Literal` types for enums (status fields, modes, etc.)
- Apply validation constraints (`min_length`, `ge`, `le`) at the model level
- Provide sensible defaults for optional parameters
- Use `Optional[]` for truly optional fields

### Step 2: Add Tool Definition to `list_tools()`

**Location**: `server.py`, inside `list_tools()` function

Add your tool to the tools list returned by the MCP server:

```python
@app.list_tools()
async def list_tools() -> list[Tool]:
    """List available MCP tools."""
    tools = [
        # Existing deliberate tool...
        Tool(
            name="deliberate",
            description=(...),
            inputSchema={...},
        ),

        # Your new tool
        Tool(
            name="new_tool",
            description=(
                "Clear, concise description of what this tool does. "
                "Include use cases and examples. Make it helpful for "
                "Claude Code users who will invoke this tool.\n\n"
                "Example usage:\n"
                '  {"parameter1": "example", "parameter2": 5}\n\n'
                "Expected behavior: Explain what the tool will do."
            ),
            inputSchema={
                "type": "object",
                "properties": {
                    "parameter1": {
                        "type": "string",
                        "description": "Description matching your Pydantic model",
                        "minLength": 1,
                    },
                    "parameter2": {
                        "type": "integer",
                        "description": "Integer parameter",
                        "minimum": 1,
                        "maximum": 10,
                        "default": 5,
                    },
                    "optional_param": {
                        "type": "string",
                        "description": "Optional parameter",
                    },
                },
                "required": ["parameter1"],  # Only required fields
            },
        ),
    ]

    return tools
```

**Best Practices**:
- **inputSchema MUST match your Pydantic model** (field names, types, constraints)
- Use JSON Schema types: `string`, `integer`, `number`, `boolean`, `array`, `object`
- Constraints: `minLength`, `maxLength`, `minimum`, `maximum`, `minItems`, `maxItems`
- Provide examples in the description (helps Claude Code understand usage)
- Multi-line descriptions are encouraged for clarity

**Conditional Tools** (like `query_decisions`):
```python
# Add tool only if config enables it
if hasattr(config, "feature_name") and config.feature_name and config.feature_name.enabled:
    tools.append(
        Tool(name="conditional_tool", description=(...), inputSchema={...})
    )
```

### Step 3: Create Tool Handler Function

**Location**: `server.py`, typically before `main()` function

Create an async handler function for your tool's logic:

```python
async def handle_new_tool(arguments: dict) -> list[TextContent]:
    """
    Handle new_tool MCP tool call.

    Args:
        arguments: Tool arguments as dict (validated by MCP client)

    Returns:
        List of TextContent with JSON response

    Raises:
        Exception: Caught and converted to error response
    """
    try:
        # Step 1: Validate request with Pydantic
        logger.info(f"Validating new_tool request: {arguments}")
        request = NewToolRequest(**arguments)

        # Step 2: Execute your tool's logic
        logger.info(f"Processing new_tool: {request.parameter1}")

        # Example: Call engine or storage components
        # result_data = await some_engine.process(request.parameter1)
        result_data = f"Processed: {request.parameter1}"

        # Step 3: Build response model
        response_model = NewToolResponse(
            status="success",
            result_data=result_data,
            metadata={"parameter2_used": request.parameter2}
        )

        # Step 4: Serialize to JSON
        result_json = json.dumps(response_model.model_dump(), indent=2)
        logger.info(f"new_tool complete: {len(result_json)} chars")

        # Step 5: Return as TextContent
        return [TextContent(type="text", text=result_json)]

    except ValidationError as e:
        # Pydantic validation failure
        logger.error(f"Validation error in new_tool: {e}", exc_info=True)
        error_response = {
            "error": f"Invalid parameters: {str(e)}",
            "error_type": "ValidationError",
            "status": "failed",
        }
        return [TextContent(type="text", text=json.dumps(error_response, indent=2))]

    except Exception as e:
        # General error handling
        logger.error(f"Error in new_tool: {type(e).__name__}: {e}", exc_info=True)
        error_response = {
            "error": str(e),
            "error_type": type(e).__name__,
            "status": "failed",
        }
        return [TextContent(type="text", text=json.dumps(error_response, indent=2))]
```

**Best Practices**:
- Always use try-except to catch errors gracefully
- Log liberally to `mcp_server.log` (helps debugging)
- Return structured error responses (don't raise exceptions to MCP layer)
- Use Pydantic's `model_dump()` for serialization (ensures consistency)
- Separate validation errors from general errors for better diagnostics
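
The two `except` branches above build nearly identical payloads. One way to keep them consistent is a small helper. The sketch below is an assumption rather than code from `server.py`, and `build_error_payload` is a hypothetical name. It returns only the JSON text so that it runs without the MCP SDK; inside a handler the result would be wrapped as `TextContent(type="text", text=...)`.

```python
import json


def build_error_payload(exc: Exception, prefix: str = "") -> str:
    """Serialize an exception into the error shape used by the handlers above.

    Hypothetical helper, not part of server.py.
    """
    payload = {
        "error": f"{prefix}{exc}",
        "error_type": type(exc).__name__,
        "status": "failed",
    }
    return json.dumps(payload, indent=2)
```

With this in place, the `ValidationError` branch could pass `prefix="Invalid parameters: "` and the general branch could omit the prefix.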

### Step 4: Route Tool Calls in `call_tool()`

**Location**: `server.py`, inside `call_tool()` function (around line 242)

Add routing logic to dispatch your new tool:

```python
@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """
    Handle tool calls from MCP client.

    Args:
        name: Tool name
        arguments: Tool arguments as dict

    Returns:
        List of TextContent with JSON response
    """
    logger.info(f"Tool call received: {name} with arguments: {arguments}")

    # Route to appropriate handler
    if name == "new_tool":
        return await handle_new_tool(arguments)
    elif name == "query_decisions":
        return await handle_query_decisions(arguments)
    elif name == "deliberate":
        # Inline handler for deliberate (existing code)
        try:
            request = DeliberateRequest(**arguments)
            result = await engine.execute(request)
            ...  # rest of deliberate logic (elided)
        except Exception as e:
            ...  # error handling (elided): return a structured error response
    else:
        # Unknown tool error
        error_msg = f"Unknown tool: {name}"
        logger.error(error_msg)
        raise ValueError(error_msg)
```

**Best Practices**:
- Use early returns for clarity (avoid deep nesting)
- Keep routing logic simple (just dispatch, don't implement logic here)
- Log the tool name and arguments on entry (debugging aid)
- Raise `ValueError` for unknown tools (MCP client will handle gracefully)
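
As the number of tools grows, the `if`/`elif` chain can be replaced by a lookup table. The following is a minimal sketch of that idea, not the routing used in `server.py`: `TOOL_HANDLERS` and `dispatch` are hypothetical names, and the stub handler returns a plain string instead of `list[TextContent]` so the example runs without the MCP SDK.

```python
import asyncio
import json
from typing import Awaitable, Callable

Handler = Callable[[dict], Awaitable[str]]


async def handle_new_tool(arguments: dict) -> str:
    # Stand-in for the real handler from Step 3.
    return json.dumps({"status": "success", "echo": arguments})


# Map tool names to handlers; adding a tool means adding one entry.
TOOL_HANDLERS: dict[str, Handler] = {
    "new_tool": handle_new_tool,
}


async def dispatch(name: str, arguments: dict) -> str:
    """Look up the handler for a tool name and await it."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    return await handler(arguments)
```

The trade-off is that every tool needs a separate handler function, so an inline handler such as the one for `deliberate` would have to be extracted first.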

### Step 5: Write Tests

**Location**: Create `tests/unit/test_new_tool.py` and `tests/integration/test_new_tool_integration.py`

#### Unit Tests (Fast, No Dependencies)

```python
# tests/unit/test_new_tool.py
import pytest
from models.schema import NewToolRequest, NewToolResponse
from pydantic import ValidationError

def test_new_tool_request_validation():
    """Test NewToolRequest validates correctly."""
    # Valid request
    req = NewToolRequest(parameter1="test", parameter2=7)
    assert req.parameter1 == "test"
    assert req.parameter2 == 7

    # Invalid: parameter2 out of range
    with pytest.raises(ValidationError):
        NewToolRequest(parameter1="test", parameter2=11)

    # Invalid: missing required parameter
    with pytest.raises(ValidationError):
        NewToolRequest(parameter2=5)

def test_new_tool_response_serialization():
    """Test NewToolResponse serializes correctly."""
    resp = NewToolResponse(
        status="success",
        result_data="test result",
        metadata={"key": "value"}
    )
    data = resp.model_dump()
    assert data["status"] == "success"
    assert data["result_data"] == "test result"
    assert data["metadata"]["key"] == "value"
```

#### Integration Tests (Direct Handler Invocation)

```python
# tests/integration/test_new_tool_integration.py
import pytest
import json
from unittest.mock import AsyncMock, MagicMock
from mcp.types import TextContent

# Import your handler
from server import handle_new_tool

@pytest.mark.asyncio
async def test_handle_new_tool_success():
    """Test handle_new_tool with valid input."""
    arguments = {"parameter1": "test", "parameter2": 5}

    result = await handle_new_tool(arguments)

    assert len(result) == 1
    assert isinstance(result[0], TextContent)

    response_data = json.loads(result[0].text)
    assert response_data["status"] == "success"
    assert "test" in response_data["result_data"]

@pytest.mark.asyncio
async def test_handle_new_tool_validation_error():
    """Test handle_new_tool with invalid input."""
    arguments = {"parameter2": 5}  # Missing required parameter1

    result = await handle_new_tool(arguments)

    assert len(result) == 1
    response_data = json.loads(result[0].text)
    assert response_data["status"] == "failed"
    assert response_data["error_type"] == "ValidationError"
```

**Testing Best Practices**:
- Test validation (valid inputs, invalid inputs, edge cases)
- Test error handling (validation errors, runtime errors)
- Test serialization (model_dump() produces correct JSON)
- Mock external dependencies (engines, storage, API calls)
- Use `pytest.mark.asyncio` for async tests
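
To illustrate the mocking advice, here is a hedged sketch using `unittest.mock.AsyncMock`. It assumes the handler's core logic receives its engine as a parameter, which is a simplification: the handlers in this guide use a module-level `engine`, which would need to be patched instead. `run_with_engine` and `engine.process` are hypothetical names.

```python
import asyncio
import json
from unittest.mock import AsyncMock


async def run_with_engine(engine, arguments: dict) -> str:
    """Hypothetical handler core that takes the engine as a dependency."""
    try:
        result = await engine.process(arguments["parameter1"])
        return json.dumps({"status": "success", "result_data": result})
    except Exception as exc:
        return json.dumps({"status": "failed", "error_type": type(exc).__name__})


def test_success_path():
    engine = AsyncMock()
    engine.process.return_value = "Processed: test"
    data = json.loads(asyncio.run(run_with_engine(engine, {"parameter1": "test"})))
    assert data["status"] == "success"
    engine.process.assert_awaited_once_with("test")


def test_engine_failure():
    # side_effect makes the mocked coroutine raise when awaited
    engine = AsyncMock()
    engine.process.side_effect = RuntimeError("backend down")
    data = json.loads(asyncio.run(run_with_engine(engine, {"parameter1": "test"})))
    assert data == {"status": "failed", "error_type": "RuntimeError"}
```

Both tests run without a real engine, storage, or network access, which keeps them fast and deterministic.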

### Step 6: Update Documentation

**Location**: `CLAUDE.md`

Add your new tool to the architecture documentation:

```markdown
## Architecture

### Core Components

**MCP Server Layer** (`server.py`)
- Entry point for MCP protocol communication via stdio
- Exposes tools: `deliberate`, `query_decisions`, `new_tool` (NEW)
- Tool: `new_tool` - [Brief description of what it does]
```

Update the data flow section if your tool has unique flow characteristics.

## Critical Rules: Stdio Safety

**WHY THIS MATTERS**: The MCP server uses stdout for protocol communication. Any write to stdout that is not MCP protocol JSON corrupts the message stream: the client can no longer parse the server's responses, and the session typically fails.

### Rules

1. **NEVER print() to stdout**
   - Bad: `print("Debug message")`
   - Good: `logger.info("Debug message")`

2. **NEVER write to sys.stdout**
   - Bad: `sys.stdout.write("output")`
   - Good: `sys.stderr.write("output")` or use logger

3. **Configure logging to file/stderr ONLY**
   ```python
   logging.basicConfig(
       level=logging.INFO,  # default level is WARNING, which would drop logger.info() calls
       handlers=[
           logging.FileHandler("mcp_server.log"),
           logging.StreamHandler(sys.stderr),  # NOT sys.stdout!
       ]
   )
   ```

4. **Return MCP responses via TextContent**
   - Good: `return [TextContent(type="text", text=json.dumps(response))]`
   - This is the ONLY correct way to send data to MCP client

5. **Suppress subprocess stdout if not needed**
   ```python
   # If invoking external processes in your tool
   result = subprocess.run(
       ["command"],
       stdout=subprocess.PIPE,  # Capture, don't print
       stderr=subprocess.PIPE
   )
   ```

### Testing Stdio Safety

Run your tool through the MCP client and verify:
- No garbled output in Claude Code
- Server log shows clean execution
- No "protocol error" messages from MCP client
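
The manual checks can be complemented by an automated one. The sketch below runs a handler under `contextlib.redirect_stdout` and reports anything it wrote to `sys.stdout`. The two handlers and `captured_stdout` are hypothetical examples, and the check only sees Python-level writes: output that a subprocess writes directly to the inherited file descriptor would not be caught this way.

```python
import asyncio
import contextlib
import io
import logging
import sys

logger = logging.getLogger("stdio_safety_demo")
logger.addHandler(logging.StreamHandler(sys.stderr))  # stderr only, never stdout


async def quiet_handler(arguments: dict) -> str:
    logger.warning("processing %s", arguments)  # goes to stderr
    return "ok"


async def noisy_handler(arguments: dict) -> str:
    print("debug:", arguments)  # violates the stdout rule
    return "ok"


def captured_stdout(handler, arguments: dict) -> str:
    """Run a handler and return whatever it wrote to sys.stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        asyncio.run(handler(arguments))
    return buffer.getvalue()
```

A test can then assert that `captured_stdout(handler, args) == ""` for every tool handler; `noisy_handler` would fail that assertion.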

## Error Handling Patterns

### Pattern 1: Pydantic Validation Errors

```python
try:
    request = NewToolRequest(**arguments)
except ValidationError as e:
    logger.error(f"Validation error: {e}", exc_info=True)
    return [TextContent(type="text", text=json.dumps({
        "error": f"Invalid parameters: {str(e)}",
        "error_type": "ValidationError",
        "status": "failed",
    }, indent=2))]
```

### Pattern 2: Runtime Errors

```python
try:
    result = await some_operation()
except SomeSpecificError as e:
    logger.error(f"Operation failed: {e}", exc_info=True)
    return [TextContent(type="text", text=json.dumps({
        "error": str(e),
        "error_type": type(e).__name__,
        "status": "failed",
    }, indent=2))]
```

### Pattern 3: Graceful Degradation

```python
# If optional feature unavailable, return partial result
try:
    enhanced_data = await optional_enhancement()
except Exception as e:
    logger.warning(f"Enhancement failed, using base data: {e}")
    enhanced_data = None

return [TextContent(type="text", text=json.dumps({
    # Compare against None so an empty-but-valid enhancement still counts as success
    "status": "success" if enhanced_data is not None else "partial",
    "result": base_data,
    "enhanced": enhanced_data,
}, indent=2))]
```

### Pattern 4: Conditional Tool Availability

```python
# In handle_new_tool()
# getattr also covers the case where the attribute exists but is None
if getattr(config, "feature", None) is None or not config.feature.enabled:
    return [TextContent(type="text", text=json.dumps({
        "error": "Feature not enabled in config.yaml",
        "error_type": "ConfigurationError",
        "status": "failed",
    }, indent=2))]
```

## Integration with Existing Components

### Using DeliberationEngine

If your tool needs to trigger deliberations:

```python
from deliberation.engine import DeliberationEngine

async def handle_new_tool(arguments: dict) -> list[TextContent]:
    # Access global engine (initialized in server.py)
    request = DeliberateRequest(
        question="Generated question",
        participants=[...],
        rounds=2
    )
    result = await engine.execute(request)
    # Process result...
```

### Using DecisionGraphStorage

If your tool needs to query decision graph:

```python
from decision_graph.storage import DecisionGraphStorage
from pathlib import Path

async def handle_new_tool(arguments: dict) -> list[TextContent]:
    db_path = Path(config.decision_graph.db_path)
    if not db_path.is_absolute():
        db_path = PROJECT_DIR / db_path

    storage = DecisionGraphStorage(str(db_path))
    decisions = storage.get_all_decisions(limit=10)
    # Process decisions...
```

### Using QueryEngine

If your tool needs advanced decision graph queries:

```python
from deliberation.query_engine import QueryEngine

async def handle_new_tool(arguments: dict) -> list[TextContent]:
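    # Named query_engine to avoid shadowing the global deliberation `engine`;
    # `storage` and `query_text` are assumed to be prepared earlier in the handler
    query_engine = QueryEngine(storage)
    results = await query_engine.search_similar(query_text, limit=5)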
    # Process results...
```

## Configuration for New Tools

If your tool needs configuration, add to `models/config.py` and `config.yaml`:

### In `models/config.py`:

```python
class NewToolConfig(BaseModel):
    """Configuration for new_tool."""
    enabled: bool = Field(default=False, description="Enable new_tool feature")
    parameter: str = Field(default="default", description="Tool-specific parameter")
    timeout: int = Field(default=60, description="Timeout in seconds")

class Config(BaseModel):
    # ... existing config ...
    new_tool: Optional[NewToolConfig] = None
```

### In `config.yaml`:

```yaml
new_tool:
  enabled: true
  parameter: "custom_value"
  timeout: 120
```

### Accessing config in handler:

```python
async def handle_new_tool(arguments: dict) -> list[TextContent]:
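    # new_tool is Optional in Config, so it may be None as well as disabled
    if getattr(config, "new_tool", None) is None or not config.new_tool.enabled:
        return [TextContent(type="text", text=json.dumps({
            "error": "new_tool not enabled in config.yaml",
            "error_type": "ConfigurationError",
            "status": "failed",
        }, indent=2))]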

    timeout = config.new_tool.timeout
    # Use config...
```
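
Because `new_tool` is declared as `Optional[NewToolConfig] = None`, the attribute always exists on `Config` and may be `None`, so an enabled-check has to cover three cases: section missing, section `None`, and section present but disabled. The helper below is a hedged sketch of such a check; `is_tool_enabled` is a hypothetical name, and `SimpleNamespace` in the usage stands in for the real config objects.

```python
from types import SimpleNamespace


def is_tool_enabled(config, section: str) -> bool:
    """Return True only if the section exists, is not None, and has enabled=True."""
    section_cfg = getattr(config, section, None)
    return bool(section_cfg is not None and getattr(section_cfg, "enabled", False))


# Usage with stand-in config objects
enabled_cfg = SimpleNamespace(new_tool=SimpleNamespace(enabled=True))
unset_cfg = SimpleNamespace(new_tool=None)
```

Here `is_tool_enabled(enabled_cfg, "new_tool")` is true, while `is_tool_enabled(unset_cfg, "new_tool")` is false instead of raising `AttributeError`.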

## Testing Your New Tool End-to-End

### 1. Manual Testing via MCP Inspector

Use the MCP Inspector tool to test your tool directly:

```bash
# Install MCP Inspector
npm install -g @modelcontextprotocol/inspector

# Run inspector with your server
mcp-inspector python /path/to/server.py
```

Invoke your tool with test inputs and verify responses.

### 2. Integration with Claude Code

Add your server to `~/.claude/config/mcp.json`:

```json
{
  "mcpServers": {
    "ai-counsel": {
      "command": "python",
      "args": ["/path/to/ai-counsel/server.py"],
      "env": {}
    }
  }
}
```

Test in Claude Code:
1. Start a conversation
2. Claude Code should auto-discover your tool
3. Trigger your tool with a query that would use it
4. Verify the response is correct

### 3. Check Logs

Always check `mcp_server.log` after testing:

```bash
tail -f /path/to/ai-counsel/mcp_server.log
```

Look for:
- Tool invocation logs
- Validation successes/failures
- Error stack traces (if any)
- Performance timings
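
Timings only show up in the log if a handler records them. One possible approach, sketched here under the assumption that logging is configured at `INFO` level as described in the stdio rules, is a small context manager around the expensive part of the handler. `log_duration` is a hypothetical helper, not something that exists in `server.py`.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("timing_demo")


@contextmanager
def log_duration(label: str):
    """Log how long the enclosed block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logger.info("%s took %.3fs", label, elapsed)
```

Inside a handler this could wrap the engine call, for example `with log_duration("new_tool"): ...`, so each invocation leaves one timing line in `mcp_server.log`.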

## Common Pitfalls

### Pitfall 1: inputSchema Mismatch with Pydantic Model

**Problem**: JSON Schema in `list_tools()` doesn't match Pydantic model fields.

**Symptom**: MCP client accepts invalid inputs, or rejects valid inputs.

**Solution**: Keep schemas in sync. Consider generating JSON Schema from Pydantic:

```python
# model_json_schema() is a classmethod on every Pydantic v2 model; no extra import is needed
schema = NewToolRequest.model_json_schema()
# Use this schema in inputSchema (but manually clean up for MCP if needed)
```
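
If the hand-written `inputSchema` is kept, a test can at least detect drift. The function below is a minimal sketch that compares only property names and the `required` list of two JSON Schema dicts; it does not compare types or constraints. `schema_drift` is a hypothetical helper.

```python
def schema_drift(input_schema: dict, model_schema: dict) -> list[str]:
    """List differences in property names and required fields between two JSON Schemas."""
    problems = []
    hand_props = set(input_schema.get("properties", {}))
    model_props = set(model_schema.get("properties", {}))
    for name in sorted(hand_props - model_props):
        problems.append(f"'{name}' is in inputSchema but not in the model")
    for name in sorted(model_props - hand_props):
        problems.append(f"'{name}' is in the model but not in inputSchema")
    if set(input_schema.get("required", [])) != set(model_schema.get("required", [])):
        problems.append("'required' lists differ")
    return problems
```

In a unit test, the second argument would be `NewToolRequest.model_json_schema()` and the first the `inputSchema` of the matching `Tool`, with the assertion that the returned list is empty.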

### Pitfall 2: Forgetting to Route in `call_tool()`

**Problem**: Tool defined in `list_tools()` but not handled in `call_tool()`.

**Symptom**: MCP client can invoke tool, but server returns "Unknown tool" error.

**Solution**: Always add routing in `call_tool()` after defining tool.

### Pitfall 3: Blocking Operations in Handler

**Problem**: Tool handler does CPU-intensive or I/O-blocking work synchronously.

**Symptom**: Server becomes unresponsive, other tools timeout.

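**Solution**: Use async libraries for I/O, and move blocking synchronous calls off the event loop:

```python
import asyncio

import httpx  # third-party async HTTP client

async def handle_new_tool(arguments: dict) -> list[TextContent]:
    # For blocking synchronous calls: run them in a worker thread so the event loop stays free.
    # (Heavy CPU-bound work is still limited by the GIL; a process pool may suit it better.)
    result = await asyncio.to_thread(blocking_function, arg1, arg2)

    # For network I/O: use an async client
    async with httpx.AsyncClient() as client:
        response = await client.get("https://api.example.com")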

    # Process result...
```

### Pitfall 4: Not Testing Error Cases

**Problem**: Only testing happy path, not validation failures or edge cases.

**Symptom**: Tool crashes or returns unclear errors when given bad input.

**Solution**: Write tests for every error scenario:

```python
@pytest.mark.asyncio
async def test_handle_new_tool_errors():
    # Missing required field
    result = await handle_new_tool({})
    assert "ValidationError" in result[0].text

    # Invalid value range
    result = await handle_new_tool({"parameter1": "test", "parameter2": 999})
    assert "failed" in result[0].text
```

## Checklist for Adding a New Tool

Use this checklist to ensure you've completed all steps:

- [ ] Define Pydantic request model in `models/schema.py`
- [ ] Define Pydantic response model in `models/schema.py`
- [ ] Add tool definition to `list_tools()` in `server.py`
- [ ] Ensure inputSchema matches Pydantic model exactly
- [ ] Create async handler function in `server.py`
- [ ] Add error handling (ValidationError + general exceptions)
- [ ] Add routing logic in `call_tool()`
- [ ] Write unit tests for models and validation
- [ ] Write integration tests for handler function
- [ ] Test stdio safety (no stdout contamination)
- [ ] Update `CLAUDE.md` architecture section
- [ ] Add configuration to `models/config.py` if needed
- [ ] Update `config.yaml` with default config if needed
- [ ] Test end-to-end with MCP Inspector
- [ ] Test integration with Claude Code
- [ ] Review logs for errors and performance
- [ ] Document any new dependencies in `requirements.txt`

## References

- **MCP Protocol Specification**: https://spec.modelcontextprotocol.io/
- **Pydantic Documentation**: https://docs.pydantic.dev/
- **MCP Python SDK**: https://github.com/modelcontextprotocol/python-sdk
- **AI Counsel Architecture**: `CLAUDE.md` in repository root
- **Existing Tool Implementations**: `server.py` lines 104-416

## Examples in Codebase

Study these existing implementations as reference:

1. **Simple tool with inline handler**: `deliberate` tool (lines 242-327 in server.py)
   - Shows: Pydantic validation, engine invocation, response truncation, error handling

2. **Separate handler function**: `query_decisions` tool (lines 329-415 in server.py)
   - Shows: Handler separation, conditional tool availability, storage integration

3. **Conditional tool**: Decision graph tools (lines 196-237 in server.py)
   - Shows: How to conditionally expose tools based on config

## Getting Help

If you encounter issues:

1. Check `mcp_server.log` for detailed error traces
2. Verify stdio safety (no stdout writes)
3. Test Pydantic models in isolation first
4. Use MCP Inspector for manual testing before integration
5. Review existing tool implementations for patterns
6. Ensure all dependencies are installed (`requirements.txt`)

---

**Remember**: Stdio safety is paramount. When in doubt, log to file/stderr, NEVER stdout.