---
name: afrexai-mcp-engineering
description: "MCP Engineering"
---

# MCP Engineering — Complete Model Context Protocol System

Build, integrate, secure, and scale MCP servers and clients. From first server to production multi-tool architecture.

## When to Use

- Building an MCP server (any language)
- Integrating MCP tools into an AI agent
- Debugging MCP connection/auth issues
- Designing multi-server architectures
- Securing MCP endpoints for production
- Evaluating which MCP servers to use

---

## Phase 1: MCP Fundamentals

### What MCP Is
Model Context Protocol = standardized way for AI agents to call external tools. Think of it as "USB for AI" — one protocol, any tool.

### Architecture
```
Agent (Client) ←→ MCP Transport ←→ MCP Server ←→ External Service
                   (stdio/HTTP)      (your code)    (API, DB, file system)
```

### Core Concepts
| Concept | What It Does | Example |
|---------|-------------|---------|
| **Server** | Exposes tools, resources, prompts | A server wrapping the GitHub API |
| **Client** | Discovers and calls server capabilities | OpenClaw, Claude Desktop, Cursor |
| **Tool** | A callable function with typed params | `create_issue(title, body, labels)` |
| **Resource** | Read-only data the agent can access | `file://workspace/config.json` |
| **Prompt** | Reusable prompt templates | `summarize_pr(pr_url)` |
| **Transport** | How client↔server communicate | stdio (local) or HTTP+SSE (remote) |

### Transport Decision
| Factor | stdio | HTTP/SSE | Streamable HTTP |
|--------|-------|----------|-----------------|
| Setup complexity | Low | Medium | Medium |
| Multi-client | No | Yes | Yes |
| Remote access | No | Yes | Yes |
| Streaming | Via stdio | SSE | Native |
| Auth needed | No (local) | Yes | Yes |
| Best for | Local dev, single agent | Production, shared | Modern production |

**Rule:** Start with stdio for development. Move to HTTP for production or multi-agent.

---

## Phase 2: Building Your First MCP Server

### Server Brief YAML
```yaml
server_name: "[service]-mcp"
description: "[What this server does in one sentence]"
transport: stdio | http
tools:
  - name: "[verb_noun]"
    description: "[What it does — be specific for LLM tool selection]"
    params:
      - name: "[param]"
        type: "string | number | boolean | object | array"
        required: true | false
        description: "[What this param controls]"
    returns: "[What the tool returns]"
    error_cases:
      - "[When/how it fails]"
resources:
  - uri: "[protocol://path]"
    description: "[What data this exposes]"
external_dependencies:
  - "[API/service this wraps]"
auth_required: true | false
auth_method: "api_key | oauth2 | none"
```

### TypeScript Server Template (stdio)
```typescript
// server.ts — minimal MCP server
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "my-service",
  version: "1.0.0",
});

// Define a tool
server.tool(
  "get_item",                          // tool name (verb_noun)
  "Fetch an item by ID",               // description (LLM reads this)
  { id: z.string().describe("Item ID") }, // params with descriptions
  async ({ id }) => {
    try {
      const result = await fetchItem(id);
      return {
        content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
      };
    } catch (error) {
      return {
        content: [{ type: "text", text: `Error: ${error.message}` }],
        isError: true,
      };
    }
  }
);

// Define a resource
server.resource(
  "config",
  "config://app",
  async (uri) => ({
    contents: [{ uri: uri.href, mimeType: "application/json", text: JSON.stringify(config) }],
  })
);

// Start
const transport = new StdioServerTransport();
await server.connect(transport);
```

### Python Server Template (stdio)
```python
# server.py — minimal MCP server
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
import json

server = Server("my-service")

@server.list_tools()
async def list_tools():
    return [
        Tool(
            name="get_item",
            description="Fetch an item by ID",
            inputSchema={
                "type": "object",
                "properties": {
                    "id": {"type": "string", "description": "Item ID"}
                },
                "required": ["id"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "get_item":
        result = await fetch_item(arguments["id"])
        return [TextContent(type="text", text=json.dumps(result, indent=2))]
    raise ValueError(f"Unknown tool: {name}")

async def main():
    async with stdio_server() as (read, write):
        await server.run(read, write, server.create_initialization_options())

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
```

### Tool Design Rules
1. **Verb-noun naming**: `create_issue`, `search_docs`, `update_config` — never `issue` or `doStuff`
2. **Descriptions are critical**: The LLM picks tools based on descriptions. Be specific. Include when NOT to use.
3. **Granular over god-tools**: `search_issues` + `get_issue` + `create_issue` beats `manage_issues`
4. **Return structured data**: JSON over prose. Let the LLM format for the user.
5. **Error messages for LLMs**: Include what went wrong AND what to try next
6. **Idempotent where possible**: `create_or_update` > `create` (prevents duplicates from retries)
7. **Limit output size**: Paginate or truncate. A 10MB response kills the context window.
8. **Include examples in descriptions**: "Search issues. Example: search_issues(query='bug label:critical')"

### Tool Description Quality Checklist
- [ ] Says what the tool DOES (not just the name restated)
- [ ] Mentions when to use vs. when NOT to use
- [ ] Each param has a description with format hints
- [ ] Return format is documented
- [ ] Edge cases mentioned (empty results, not found, etc.)

---

## Phase 3: HTTP Transport & Production Server

### HTTP Server Template (TypeScript)
```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import express from "express";

const app = express();
app.use(express.json());

const server = new McpServer({ name: "my-service", version: "1.0.0" });
// ... register tools ...

app.post("/mcp", async (req, res) => {
  const transport = new StreamableHTTPServerTransport("/mcp", res);
  await server.connect(transport);
  await transport.handleRequest(req, res);
});

app.listen(3001, () => console.log("MCP server on :3001"));
```

### Auth Patterns

#### API Key (simplest)
```typescript
// Middleware
function authMiddleware(req, res, next) {
  const key = req.headers["x-api-key"] || req.headers.authorization?.replace("Bearer ", "");
  if (!key || !validKeys.has(key)) {
    return res.status(401).json({ error: "Invalid API key" });
  }
  req.userId = keyToUser.get(key);
  next();
}
```

#### OAuth 2.0 (for user-scoped access)
```yaml
# MCP OAuth flow
1. Client requests tool → server returns 401 with auth URL
2. User completes OAuth in browser → gets access token
3. Client stores token, includes in subsequent requests
4. Server validates token, calls external API on user's behalf
```

### Production Checklist
- [ ] Rate limiting per client/key
- [ ] Request validation (schema check before execution)
- [ ] Structured logging (request ID, tool name, latency, status)
- [ ] Health check endpoint (`/health`)
- [ ] Graceful shutdown (finish in-flight requests)
- [ ] Timeout on external calls (don't let tools hang forever)
- [ ] Output size limits (truncate large responses)
- [ ] Error categorization (4xx client vs 5xx server)
- [ ] CORS if browser clients connect
- [ ] TLS in production (always HTTPS)

---

## Phase 4: Client Integration

### OpenClaw Configuration
```yaml
# In openclaw config — stdio server
mcpServers:
  my-service:
    command: "node"
    args: ["path/to/server.js"]
    env:
      API_KEY: "{{env.MY_SERVICE_API_KEY}}"
```

```yaml
# HTTP server
mcpServers:
  my-service:
    url: "https://mcp.myservice.com/mcp"
    headers:
      Authorization: "Bearer {{env.MY_SERVICE_TOKEN}}"
```

### Claude Desktop Configuration
```json
{
  "mcpServers": {
    "my-service": {
      "command": "node",
      "args": ["/path/to/server.js"],
      "env": { "API_KEY": "your-key" }
    }
  }
}
```

### Client-Side Tool Selection
When multiple MCP servers are connected, the agent sees ALL tools. Help the agent pick correctly:

1. **Unique tool names**: Prefix if needed (`github_search` vs `jira_search`)
2. **Clear descriptions**: Disambiguate similar tools across servers
3. **Don't overload**: 20-30 tools max across all servers. Beyond that, agents get confused.

### Multi-Server Architecture
```
Agent
├── github-mcp (code: create_pr, search_code, list_issues)
├── slack-mcp (comms: send_message, search_messages)
├── postgres-mcp (data: query, list_tables)
└── internal-mcp (business: get_customer, update_pipeline)
```

**Principle:** One server per domain. Don't build a mega-server.

---

## Phase 5: Testing MCP Servers

### Test Pyramid
```
        /  E2E  \        Agent actually uses the tool
       / Integration \    Tool calls real API (sandbox)
      /    Unit       \   Business logic without MCP layer
```

### Unit Test Pattern
```typescript
// Test the tool handler directly, no MCP transport
describe("get_item", () => {
  it("returns item when found", async () => {
    mockDb.findById.mockResolvedValue({ id: "123", name: "Test" });
    const result = await getItemHandler({ id: "123" });
    expect(result.content[0].text).toContain("Test");
  });

  it("returns error for missing item", async () => {
    mockDb.findById.mockResolvedValue(null);
    const result = await getItemHandler({ id: "missing" });
    expect(result.isError).toBe(true);
  });

  it("handles API timeout gracefully", async () => {
    mockDb.findById.mockRejectedValue(new Error("timeout"));
    const result = await getItemHandler({ id: "123" });
    expect(result.isError).toBe(true);
    expect(result.content[0].text).toContain("try again");
  });
});
```

### Integration Test with MCP Inspector
```bash
# Use the MCP Inspector to manually test
npx @modelcontextprotocol/inspector node server.js

# Or use mcporter for CLI testing
mcporter call my-service.get_item id=123
mcporter list my-service --schema  # verify tool schemas
```

### Test Checklist Per Tool
- [ ] Happy path returns expected format
- [ ] Missing required params returns clear error
- [ ] Invalid param types return clear error
- [ ] Not-found cases handled (don't throw, return error content)
- [ ] Rate limit / quota exceeded handled
- [ ] Auth failure handled (expired token, invalid key)
- [ ] Large response truncated appropriately
- [ ] Timeout handled (external API slow)
- [ ] Concurrent calls don't interfere

---

## Phase 6: Common MCP Server Patterns

### 1. API Wrapper (most common)
Wrap an existing REST/GraphQL API as MCP tools.
```
External API → MCP Server → Agent
```
**Key decisions:**
- Map 1 API endpoint → 1 MCP tool (usually)
- Simplify params (agent doesn't need every API option)
- Aggregate related calls (e.g., get user + get user's repos = 1 tool)
- Cache where safe (reduce API calls)

### 2. Database Query
```
Database → MCP Server → Agent
```
**Safety rules:**
- Read-only by default. Write tools require explicit opt-in.
- Parameterized queries only. NEVER interpolate agent input into SQL.
- Row limit on all queries (agent can ask for more if needed).
- Schema as a resource (let agent discover tables/columns).

### 3. File System
```
File System → MCP Server → Agent
```
**Safety rules:**
- Sandbox to specific directories. Never allow `../` traversal.
- Read-only by default. Write requires allowlist.
- Size limits on reads. Don't send 1GB files through MCP.

### 4. Multi-Step Workflow
Some tools need to orchestrate multiple steps:
```typescript
server.tool("deploy_service", "Build, test, and deploy a service", {
  service: z.string(),
  environment: z.enum(["staging", "production"]),
}, async ({ service, environment }) => {
  // Step 1: Build
  const buildResult = await build(service);
  if (!buildResult.success) return error(`Build failed: ${buildResult.error}`);

  // Step 2: Test
  const testResult = await runTests(service);
  if (!testResult.success) return error(`Tests failed: ${testResult.summary}`);

  // Step 3: Deploy (only if build + tests pass)
  if (environment === "production") {
    // Extra safety: require confirmation resource
    return {
      content: [{
        type: "text",
        text: `Ready to deploy ${service} to production. Tests: ${testResult.passed}/${testResult.total} passed. Call confirm_deploy to proceed.`
      }]
    };
  }
  const deployResult = await deploy(service, environment);
  return success(`Deployed ${service} to ${environment}: ${deployResult.url}`);
});
```

### 5. Aggregator Server
Combine multiple data sources into unified tools:
```
GitHub + Jira + PagerDuty → DevOps MCP Server → Agent
```
One `get_service_status` tool that queries all three and returns a unified view.

---

## Phase 7: Security & Hardening

### Threat Model
| Threat | Risk | Mitigation |
|--------|------|------------|
| Prompt injection via tool output | Agent executes malicious instructions in API response | Sanitize output, strip HTML/scripts |
| Excessive permissions | Tool has write access it shouldn't | Principle of least privilege per tool |
| Data exfiltration | Agent sends sensitive data to wrong tool | Tool allowlists, audit logging |
| Denial of service | Agent calls tool in infinite loop | Rate limiting, circuit breakers |
| Credential leakage | API keys in tool responses | Strip sensitive fields from output |
| SSRF | Agent provides URL that hits internal network | URL allowlisting, no private IPs |

### Security Checklist
- [ ] Every tool has minimum required permissions
- [ ] Write operations require explicit confirmation or are behind feature flags
- [ ] API keys/secrets NEVER appear in tool responses
- [ ] Output sanitized (no HTML, no executable content)
- [ ] Rate limits per tool AND per client
- [ ] Audit log: who called what tool, when, with what params
- [ ] Input validation before any external call
- [ ] URL parameters validated against allowlist (prevent SSRF)
- [ ] Timeout on every external call (max 30s default)
- [ ] Circuit breaker: disable tool if error rate > 50% for 5 min

### Dangerous Tool Patterns (Avoid)
```
❌ server.tool("execute_sql", ..., async ({ query }) => db.raw(query))
❌ server.tool("run_command", ..., async ({ cmd }) => exec(cmd))
❌ server.tool("fetch_url", ..., async ({ url }) => fetch(url))  // SSRF
❌ server.tool("write_file", ..., async ({ path, content }) => fs.writeFile(path, content))
```

### Safe Alternatives
```
✅ Parameterized queries with allowlisted tables
✅ Predefined commands with argument validation
✅ URL allowlist + no private IP ranges
✅ Write to specific directory + filename validation
```

---

## Phase 8: Debugging & Troubleshooting

### Common Issues

| Symptom | Likely Cause | Fix |
|---------|-------------|-----|
| Tool not appearing in agent | Schema error / server not connected | Check `mcporter list` or client logs |
| "Connection refused" | Server not running or wrong port | Verify process, check port |
| Tool times out | External API slow or hanging | Add timeout, check API health |
| "Invalid params" | Schema mismatch between client/server | Verify schema with `--schema` flag |
| Agent picks wrong tool | Ambiguous descriptions | Rewrite descriptions, add "Use this when..." |
| Agent calls tool in loop | Tool returning confusing error | Return clearer error with "do NOT retry" |
| Large response crashes | No output truncation | Add pagination or character limit |
| Auth errors intermittent | Token expiry | Implement token refresh |

### Debug Workflow
1. **Verify server starts**: `node server.js` — does it start without errors?
2. **List tools**: `mcporter list my-server --schema` — are all tools registered?
3. **Call directly**: `mcporter call my-server.tool_name param=value` — does it return expected output?
4. **Check client config**: Is the server path/URL correct? Are env vars set?
5. **Read client logs**: Most clients log MCP connection errors
6. **Test with Inspector**: `npx @modelcontextprotocol/inspector` for interactive debugging

### Logging Template
```typescript
server.tool("my_tool", description, schema, async (params) => {
  const requestId = crypto.randomUUID().slice(0, 8);
  console.error(`[${requestId}] my_tool called:`, JSON.stringify(params));
  const start = Date.now();
  try {
    const result = await doWork(params);
    console.error(`[${requestId}] my_tool success: ${Date.now() - start}ms`);
    return success(result);
  } catch (error) {
    console.error(`[${requestId}] my_tool error: ${error.message} (${Date.now() - start}ms)`);
    return errorResponse(error.message);
  }
});
```

Note: Use `console.error` for logs in stdio transport (stdout is reserved for MCP protocol).

---

## Phase 9: MCP Server Selection Guide

### Evaluating Existing MCP Servers

Score 0-5 per dimension:

| Dimension | What to Check |
|-----------|--------------|
| **Maintained** | Last commit < 3 months? Issues addressed? Version > 1.0? |
| **Secure** | No raw SQL/exec? Auth implemented? Input validated? |
| **Well-typed** | Full JSON Schema for all tools? Descriptions useful? |
| **Tested** | Has tests? CI passing? |
| **Documented** | Setup instructions? Tool descriptions? Examples? |
| **Lightweight** | Minimal dependencies? Fast startup? |

**Score < 15/30**: Build your own. **Score 15-24**: Use with caution. **Score 25+**: Good to use.

### Popular MCP Server Categories
| Category | Use Case | Examples |
|----------|----------|---------|
| Code | GitHub, GitLab, code search | github-mcp, gitlab-mcp |
| Data | PostgreSQL, SQLite, Snowflake | postgres-mcp, sqlite-mcp |
| Comms | Slack, Discord, email | slack-mcp, gmail-mcp |
| Docs | Notion, Confluence, Google Docs | notion-mcp, gdocs-mcp |
| DevOps | AWS, GCP, Kubernetes, Terraform | aws-mcp, k8s-mcp |
| Search | Brave, Google, vector stores | brave-search, rag-mcp |
| Files | Local FS, S3, Google Drive | filesystem-mcp, s3-mcp |
| CRM | HubSpot, Salesforce | hubspot-mcp, sfdc-mcp |

---

## Phase 10: Architecture Patterns

### Single Agent + Multiple Servers
```
Agent ──┬── github-mcp
        ├── slack-mcp
        ├── postgres-mcp
        └── custom-mcp
```
Best for: Most use cases. Simple, effective.

### Gateway Pattern
```
Agent ── MCP Gateway ──┬── server-1
                       ├── server-2
                       └── server-3
```
Gateway handles: auth, rate limiting, logging, routing.
Best for: Enterprise, multi-tenant, compliance requirements.

### Agent-per-Domain
```
Orchestrator Agent
├── Code Agent (github-mcp, gitlab-mcp)
├── Data Agent (postgres-mcp, analytics-mcp)
└── Comms Agent (slack-mcp, email-mcp)
```
Best for: Complex workflows, specialized agents.

### Tool Count Guidelines
| Total Tools | Recommendation |
|-------------|---------------|
| 1-10 | Great. Agent handles well. |
| 10-20 | Good. Ensure distinct descriptions. |
| 20-30 | Caution. Group by server, review descriptions. |
| 30-50 | Risk. Consider agent-per-domain pattern. |
| 50+ | Dangerous. Agent WILL pick wrong tools. Split or use gateway. |

---

## Phase 11: Publishing MCP Servers

### Package Structure
```
my-mcp-server/
├── src/
│   ├── server.ts        # MCP server entry
│   ├── tools/           # Tool handlers
│   │   ├── search.ts
│   │   └── create.ts
│   ├── auth.ts          # Auth middleware
│   └── config.ts        # Configuration
├── tests/
│   ├── tools.test.ts
│   └── integration.test.ts
├── package.json
├── tsconfig.json
├── README.md            # Setup + tool docs
└── LICENSE
```

### README Template for MCP Servers
```markdown
# [Service] MCP Server

[One sentence: what this enables]

## Quick Start
[3 steps max to get running]

## Tools
| Tool | Description | Params |
|------|-------------|--------|
[Table of all tools]

## Configuration
[Env vars, auth setup]

## Examples
[2-3 real usage examples with agent conversation]
```

### npm Publishing
```bash
# package.json
{
  "name": "@myorg/service-mcp",
  "version": "1.0.0",
  "bin": { "service-mcp": "./dist/server.js" },
  "files": ["dist"],
  "keywords": ["mcp", "model-context-protocol", "ai-tools"]
}

npm publish
```

---

## Quality Rubric (0-100)

| Dimension | Weight | What to Score |
|-----------|--------|--------------|
| Tool design | 20% | Names, descriptions, granularity, params |
| Security | 20% | Auth, input validation, output sanitization, least privilege |
| Reliability | 15% | Error handling, timeouts, circuit breakers |
| Testing | 15% | Unit + integration coverage, edge cases |
| Documentation | 10% | Setup, tool docs, examples |
| Performance | 10% | Response time, output size, caching |
| Maintainability | 10% | Code structure, types, logging |

**Score 0-40**: Not production ready. **40-70**: Usable with caveats. **70-90**: Solid. **90+**: Excellent.

---

## Common Mistakes

| Mistake | Fix |
|---------|-----|
| God-tool that does everything | Split into focused tools |
| Vague tool descriptions | Write descriptions as if explaining to a new hire |
| No error handling | Every external call wrapped in try/catch |
| Returning raw API responses | Shape output for agent consumption |
| No rate limiting | Add per-tool and per-client limits |
| Ignoring output size | Paginate or truncate responses |
| Hardcoded credentials | Use env vars or secret manager |
| No logging | Can't debug what you can't see |
| Testing only happy path | Test errors, timeouts, edge cases |
| Building before checking | Search for existing MCP server first |

---

## Natural Language Commands

- "Build an MCP server for [service]" → Use Phase 2 templates
- "Add a tool to my MCP server" → Follow tool design rules
- "Secure my MCP server" → Phase 7 checklist
- "Debug MCP connection issue" → Phase 8 workflow
- "Evaluate this MCP server" → Phase 9 scoring
- "Design multi-server architecture" → Phase 10 patterns
- "Publish my MCP server" → Phase 11 structure
- "Convert REST API to MCP" → Phase 6 Pattern 1
- "Add auth to my MCP server" → Phase 3 auth patterns
- "Test my MCP server" → Phase 5 checklist
- "How many tools is too many?" → Phase 10 tool count table
- "Review my tool descriptions" → Phase 2 quality checklist