{ "cells": [ { "cell_type": "raw", "metadata": { "vscode": { "languageId": "raw" } }, "source": [ "# 🚀 Production Deployment with FastMCP\n", "\n", "Welcome to production deployment with FastMCP! In this notebook, you'll learn how to deploy FastMCP servers in production environments using Docker for containerization and Kubernetes for orchestration.\n", "\n", "## 🎯 Learning Objectives\n", "\n", "By the end of this notebook, you will:\n", "- Deploy FastMCP servers with Docker\n", "- Scale with Kubernetes\n", "- Implement monitoring and logging\n", "- Handle configuration management\n", "- Apply production security patterns\n", "\n", "## 🛠️ What You'll Build\n", "\n", "```python\n", "from typing import Dict\n", "\n", "from mcp.server.fastmcp import FastMCP\n", "from pydantic import BaseModel\n", "from prometheus_client import Counter, Histogram\n", "import structlog\n", "import sentry_sdk\n", "\n", "# Initialize FastMCP server\n", "# (config_path and metrics_enabled are illustrative settings; confirm that your FastMCP version accepts them)\n", "mcp = FastMCP(\n", "    name=\"production_server\",\n", "    config_path=\"/etc/mcp/config.yaml\",\n", "    metrics_enabled=True\n", ")\n", "\n", "# Metrics\n", "requests_total = Counter(\n", "    'mcp_requests_total',\n", "    'Total MCP requests'\n", ")\n", "request_duration = Histogram(\n", "    'mcp_request_duration_seconds',\n", "    'Request duration in seconds'\n", ")\n", "\n", "# Structured logging\n", "logger = structlog.get_logger()\n", "\n", "@mcp.tool()\n", "async def process_data(input_data: str) -> Dict:\n", "    \"\"\"Process data with monitoring\"\"\"\n", "    with request_duration.time():\n", "        requests_total.inc()\n", "        logger.info(\"processing_data\", input=input_data)\n", "        # Implementation...\n", "```\n", "\n", "## 🔍 Production Patterns\n", "\n", "This notebook covers the following production concerns:\n", "- **Configuration Management**\n", "- **Metrics & Monitoring**\n", "- **Structured Logging**\n", "- **Health Checks**\n", "- **Resource Management**\n", "\n", "## 📚 Table of Contents\n", "\n", "1. [Docker Deployment](#docker-deployment)\n", "2. [Kubernetes Deployment](#kubernetes-deployment)\n", "3. 
[Monitoring Stack](#monitoring-stack)\n", "4. [Running in Production](#running-in-production)\n", "5. [Key Takeaways](#key-takeaways)\n" ] }, { "cell_type": "raw", "metadata": { "vscode": { "languageId": "raw" } }, "source": [ "# 🐳 Docker Deployment\n", "\n", "First, let's create a production-ready Dockerfile for our FastMCP server:\n", "\n", "```dockerfile\n", "# Use official Python image\n", "FROM python:3.11-slim\n", "\n", "# Set working directory\n", "WORKDIR /app\n", "\n", "# Install system dependencies (curl is required by the HEALTHCHECK below)\n", "RUN apt-get update && apt-get install -y --no-install-recommends \\\n", "    build-essential \\\n", "    curl \\\n", "    && rm -rf /var/lib/apt/lists/*\n", "\n", "# Install Python dependencies\n", "COPY requirements.txt .\n", "RUN pip install --no-cache-dir -r requirements.txt\n", "\n", "# Copy application code\n", "COPY . .\n", "\n", "# Create non-root user\n", "RUN useradd -m mcp && chown -R mcp:mcp /app\n", "USER mcp\n", "\n", "# Set environment variables\n", "ENV MCP_ENV=production\n", "ENV MCP_CONFIG=/app/config/production.yaml\n", "ENV PYTHONPATH=/app\n", "\n", "# Expose port\n", "EXPOSE 8000\n", "\n", "# Health check\n", "HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \\\n", "    CMD curl -f http://localhost:8000/health || exit 1\n", "\n", "# Start FastMCP server via the __main__ block in server.py\n", "CMD [\"python\", \"server.py\"]\n", "```\n", "\n", "And a production configuration file (`config/production.yaml`):\n", "\n", "```yaml\n", "server:\n", "  name: production_mcp\n", "  host: 0.0.0.0\n", "  port: 8000\n", "  workers: 4\n", "  timeout: 30\n", "\n", "security:\n", "  allowed_origins:\n", "    - https://api.example.com\n", "  api_key_header: X-API-Key\n", "  rate_limit:\n", "    requests: 100\n", "    window_seconds: 60\n", "\n", "monitoring:\n", "  metrics_enabled: true\n", "  metrics_port: 9090\n", "  log_level: INFO\n", "  sentry_dsn: ${SENTRY_DSN}\n", "\n", "resources:\n", "  cache:\n", "    enabled: true\n", "    
ttl_seconds: 300\n", "  database:\n", "    url: ${DATABASE_URL}\n", "    pool_size: 20\n", "    max_overflow: 10\n", "```\n", "\n", "Build and run the container:\n", "\n", "```bash\n", "# Build image\n", "docker build -t fastmcp-server:1.0 .\n", "\n", "# Run container\n", "docker run -d \\\n", "  --name mcp-server \\\n", "  -p 8000:8000 \\\n", "  -p 9090:9090 \\\n", "  -v /path/to/config:/app/config \\\n", "  -e SENTRY_DSN=your-dsn \\\n", "  -e DATABASE_URL=your-url \\\n", "  fastmcp-server:1.0\n", "```\n" ] }, { "cell_type": "raw", "metadata": { "vscode": { "languageId": "raw" } }, "source": [ "# ⚑ Kubernetes Deployment\n", "\n", "Now let's deploy our FastMCP server to Kubernetes. First, create the necessary Kubernetes manifests:\n", "\n", "`k8s/deployment.yaml`:\n", "```yaml\n", "apiVersion: apps/v1\n", "kind: Deployment\n", "metadata:\n", "  name: mcp-server\n", "  labels:\n", "    app: mcp-server\n", "spec:\n", "  replicas: 3\n", "  selector:\n", "    matchLabels:\n", "      app: mcp-server\n", "  template:\n", "    metadata:\n", "      labels:\n", "        app: mcp-server\n", "      annotations:\n", "        prometheus.io/scrape: \"true\"\n", "        prometheus.io/port: \"9090\"\n", "    spec:\n", "      containers:\n", "      - name: mcp-server\n", "        image: fastmcp-server:1.0\n", "        ports:\n", "        - containerPort: 8000\n", "          name: http\n", "        - containerPort: 9090\n", "          name: metrics\n", "        env:\n", "        - name: SENTRY_DSN\n", "          valueFrom:\n", "            secretKeyRef:\n", "              name: mcp-secrets\n", "              key: sentry-dsn\n", "        - name: DATABASE_URL\n", "          valueFrom:\n", "            secretKeyRef:\n", "              name: mcp-secrets\n", "              key: database-url\n", "        resources:\n", "          requests:\n", "            memory: \"256Mi\"\n", "            cpu: \"100m\"\n", "          limits:\n", "            memory: \"512Mi\"\n", "            cpu: \"200m\"\n", "        readinessProbe:\n", "          httpGet:\n", "            path: /health\n", "            port: http\n", "          initialDelaySeconds: 5\n", "          periodSeconds: 10\n", "        livenessProbe:\n", "          httpGet:\n", "            path: /health\n", "            port: http\n", "          initialDelaySeconds: 15\n", "          periodSeconds: 20\n", "        volumeMounts:\n", "        - name: config\n", "          
mountPath: /app/config\n", "      volumes:\n", "      - name: config\n", "        configMap:\n", "          name: mcp-config\n", "```\n", "\n", "`k8s/service.yaml`:\n", "```yaml\n", "apiVersion: v1\n", "kind: Service\n", "metadata:\n", "  name: mcp-server\n", "spec:\n", "  selector:\n", "    app: mcp-server\n", "  ports:\n", "  - name: http\n", "    port: 80\n", "    targetPort: 8000\n", "  - name: metrics\n", "    port: 9090\n", "    targetPort: 9090\n", "  type: LoadBalancer\n", "```\n", "\n", "`k8s/configmap.yaml`:\n", "```yaml\n", "apiVersion: v1\n", "kind: ConfigMap\n", "metadata:\n", "  name: mcp-config\n", "data:\n", "  production.yaml: |\n", "    server:\n", "      name: production_mcp\n", "      host: 0.0.0.0\n", "      port: 8000\n", "      workers: 4\n", "      timeout: 30\n", "    # ... rest of config ...\n", "```\n", "\n", "`k8s/secrets.yaml`:\n", "```yaml\n", "apiVersion: v1\n", "kind: Secret\n", "metadata:\n", "  name: mcp-secrets\n", "type: Opaque\n", "data:\n", "  sentry-dsn: base64_encoded_dsn\n", "  database-url: base64_encoded_url\n", "```\n", "\n", "Deploy to Kubernetes:\n", "```bash\n", "# Apply manifests\n", "kubectl apply -f k8s/\n", "\n", "# Check status\n", "kubectl get pods -l app=mcp-server\n", "kubectl get services mcp-server\n", "\n", "# View logs\n", "kubectl logs -l app=mcp-server\n", "\n", "# Scale replicas\n", "kubectl scale deployment mcp-server --replicas=5\n", "```\n" ] }, { "cell_type": "raw", "metadata": { "vscode": { "languageId": "raw" } }, "source": [ "# 📊 Monitoring Stack\n", "\n", "Let's set up monitoring for our FastMCP server using Prometheus and Grafana:\n", "\n", "`k8s/prometheus.yaml`:\n", "```yaml\n", "apiVersion: v1\n", "kind: ConfigMap\n", "metadata:\n", "  name: prometheus-config\n", "data:\n", "  prometheus.yml: |\n", "    global:\n", "      scrape_interval: 15s\n", "    scrape_configs:\n", "      - job_name: 'mcp-servers'\n", "        kubernetes_sd_configs:\n", "          - role: pod\n", "        relabel_configs:\n", "          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]\n", "            action: keep\n", "            regex: 
true\n", "          # Point the scrape address at the port named in the prometheus.io/port annotation\n", "          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]\n", "            action: replace\n", "            regex: ([^:]+)(?::\\d+)?;(\\d+)\n", "            replacement: $1:$2\n", "            target_label: __address__\n", "\n", "---\n", "apiVersion: apps/v1\n", "kind: Deployment\n", "metadata:\n", "  name: prometheus\n", "spec:\n", "  selector:\n", "    matchLabels:\n", "      app: prometheus\n", "  template:\n", "    metadata:\n", "      labels:\n", "        app: prometheus\n", "    spec:\n", "      containers:\n", "      - name: prometheus\n", "        image: prom/prometheus\n", "        ports:\n", "        - containerPort: 9090\n", "        volumeMounts:\n", "        - name: config\n", "          mountPath: /etc/prometheus\n", "      volumes:\n", "      - name: config\n", "        configMap:\n", "          name: prometheus-config\n", "```\n", "\n", "`k8s/grafana.yaml`:\n", "```yaml\n", "apiVersion: apps/v1\n", "kind: Deployment\n", "metadata:\n", "  name: grafana\n", "spec:\n", "  selector:\n", "    matchLabels:\n", "      app: grafana\n", "  template:\n", "    metadata:\n", "      labels:\n", "        app: grafana\n", "    spec:\n", "      containers:\n", "      - name: grafana\n", "        image: grafana/grafana\n", "        ports:\n", "        - containerPort: 3000\n", "        env:\n", "        - name: GF_SECURITY_ADMIN_PASSWORD\n", "          valueFrom:\n", "            secretKeyRef:\n", "              name: grafana-secrets\n", "              key: admin-password\n", "```\n", "\n", "Example Grafana dashboard for FastMCP metrics:\n", "```json\n", "{\n", "  \"title\": \"FastMCP Dashboard\",\n", "  \"panels\": [\n", "    {\n", "      \"title\": \"Requests per Second\",\n", "      \"type\": \"graph\",\n", "      \"datasource\": \"Prometheus\",\n", "      \"targets\": [\n", "        {\n", "          \"expr\": \"rate(mcp_requests_total[5m])\",\n", "          \"legendFormat\": \"{{instance}}\"\n", "        }\n", "      ]\n", "    },\n", "    {\n", "      \"title\": \"Request Duration\",\n", "      \"type\": \"heatmap\",\n", "      \"datasource\": \"Prometheus\",\n", "      \"targets\": [\n", "        {\n", "          \"expr\": \"rate(mcp_request_duration_seconds_bucket[5m])\",\n", "          \"format\": \"heatmap\"\n", "        }\n", "      ]\n", "    }\n", "  ]\n", "}\n", "```\n", "\n", "Deploy monitoring stack:\n", "```bash\n", "# Apply Prometheus and Grafana\n", "kubectl 
apply -f k8s/prometheus.yaml\n", "kubectl apply -f k8s/grafana.yaml\n", "\n", "# Access Grafana (the manifest above defines a Deployment but no Service)\n", "kubectl port-forward deployment/grafana 3000:3000\n", "```\n" ] }, { "cell_type": "raw", "metadata": { "vscode": { "languageId": "raw" } }, "source": [ "# 🚀 Running in Production\n", "\n", "Let's implement a production-ready FastMCP server with all the features we've discussed:\n", "\n", "```python\n", "import os\n", "from typing import Dict, Optional\n", "from datetime import datetime\n", "import structlog\n", "import sentry_sdk\n", "from prometheus_client import Counter, Histogram\n", "from pydantic import BaseModel\n", "\n", "from mcp.server.fastmcp import FastMCP\n", "# NOTE: mcp.monitoring and mcp.security are hypothetical modules used for illustration.\n", "# The official MCP Python SDK does not appear to ship them, so supply your own implementations.\n", "from mcp.monitoring import MetricsMiddleware\n", "from mcp.security import RateLimiter, APIKeyAuth\n", "\n", "# Initialize logging\n", "structlog.configure(\n", "    processors=[\n", "        structlog.processors.TimeStamper(fmt=\"iso\"),\n", "        structlog.processors.JSONRenderer()\n", "    ]\n", ")\n", "logger = structlog.get_logger()\n", "\n", "# Initialize error tracking\n", "sentry_sdk.init(dsn=os.getenv(\"SENTRY_DSN\"))\n", "\n", "# Initialize metrics\n", "requests_total = Counter(\n", "    'mcp_requests_total',\n", "    'Total MCP requests',\n", "    ['endpoint', 'status']\n", ")\n", "request_duration = Histogram(\n", "    'mcp_request_duration_seconds',\n", "    'Request duration in seconds',\n", "    ['endpoint']\n", ")\n", "\n", "# Data models\n", "class ProcessingResult(BaseModel):\n", "    \"\"\"Result model\"\"\"\n", "    id: str\n", "    status: str\n", "    timestamp: datetime\n", "    metadata: Optional[Dict] = None\n", "\n", "# Initialize FastMCP\n", "# (config_path, metrics_enabled, and middleware are illustrative settings; confirm that your FastMCP version accepts them)\n", "mcp = FastMCP(\n", "    name=\"production_server\",\n", "    config_path=\"/app/config/production.yaml\",\n", "    metrics_enabled=True,\n", "    middleware=[\n", "        MetricsMiddleware(),\n", "        RateLimiter(\n", "            requests_per_minute=100,\n", "            burst_size=20\n", "        ),\n", "        APIKeyAuth(\n", "            header_name=\"X-API-Key\",\n", "            key_validator=lambda k: k == os.getenv(\"API_KEY\")\n", "        )\n", "    ]\n", 
")\n", "\n", "# Resource provider\n", "@mcp.resource(\"data://{id}\")\n", "async def get_data(id: str) -> Dict:\n", "    \"\"\"Get data by ID with caching\"\"\"\n", "    try:\n", "        # In production, implement caching here\n", "        logger.info(\"fetching_data\", id=id)\n", "        with request_duration.labels(endpoint=\"get_data\").time():\n", "            # Implementation...\n", "            return {\"id\": id, \"value\": \"data\"}\n", "    except Exception as e:\n", "        logger.error(\"data_fetch_error\", id=id, error=str(e))\n", "        sentry_sdk.capture_exception(e)\n", "        return {\"error\": str(e)}\n", "\n", "# Tool implementation\n", "@mcp.tool()\n", "async def process_data(input_data: str) -> Dict:\n", "    \"\"\"\n", "    Process data with full production features\n", "\n", "    Args:\n", "        input_data: Data to process\n", "\n", "    Returns:\n", "        Processing results with metadata\n", "    \"\"\"\n", "    try:\n", "        logger.info(\"processing_data\", input=input_data)\n", "\n", "        with request_duration.labels(endpoint=\"process_data\").time():\n", "            # Implementation...\n", "            result = ProcessingResult(\n", "                id=\"123\",\n", "                status=\"completed\",\n", "                timestamp=datetime.now(),\n", "                metadata={\"source\": \"tool\"}\n", "            )\n", "\n", "        requests_total.labels(\n", "            endpoint=\"process_data\",\n", "            status=\"success\"\n", "        ).inc()\n", "\n", "        return {\n", "            \"success\": True,\n", "            # mode=\"json\" converts the datetime to an ISO 8601 string (Pydantic v2)\n", "            \"data\": result.model_dump(mode=\"json\")\n", "        }\n", "\n", "    except Exception as e:\n", "        logger.error(\"processing_error\",\n", "                     input=input_data,\n", "                     error=str(e))\n", "        sentry_sdk.capture_exception(e)\n", "\n", "        requests_total.labels(\n", "            endpoint=\"process_data\",\n", "            status=\"error\"\n", "        ).inc()\n", "\n", "        return {\n", "            \"success\": False,\n", "            \"error\": str(e),\n", "            \"prompt\": \"error_prompt\"\n", "        }\n", "\n", "if __name__ == \"__main__\":\n", "    # Run production server (check which arguments run() accepts in your FastMCP version)\n", "    mcp.run(\n", "        host=\"0.0.0.0\",\n", "        port=8000,\n", "        workers=4,\n", "        log_level=\"INFO\"\n", "    )\n", "```\n", "\n", "## 🎯 Key Takeaways\n", "\n", "1. 
**Configuration**\n", "   - Use YAML for config\n", "   - Environment variables for secrets\n", "   - Separate dev/prod settings\n", "\n", "2. **Monitoring**\n", "   - Prometheus metrics\n", "   - Grafana dashboards\n", "   - Structured logging\n", "   - Error tracking\n", "\n", "3. **Security**\n", "   - Rate limiting\n", "   - API key auth\n", "   - Secure defaults\n", "   - Input validation\n", "\n", "4. **Deployment**\n", "   - Docker containers\n", "   - Kubernetes orchestration\n", "   - Health checks\n", "   - Resource management\n", "\n", "## 🔜 Next Steps\n", "\n", "Continue to [Testing Strategies](16_testing_strategies.ipynb) to learn how to thoroughly test your FastMCP servers!\n" ] }, { "cell_type": "raw", "metadata": { "vscode": { "languageId": "raw" } }, "source": [ "# 🚀 Production Deployment of MCP Servers\n", "\n", "Welcome to the advanced world of MCP production deployment! This notebook covers enterprise-grade deployment strategies, monitoring, scaling, and best practices for running MCP servers in production environments.\n", "\n", "## 🎯 Learning Objectives\n", "\n", "By the end of this notebook, you will:\n", "- Deploy MCP servers using Docker and Kubernetes\n", "- Implement comprehensive monitoring and logging\n", "- Set up auto-scaling and load balancing\n", "- Configure security hardening for production\n", "- Implement CI/CD pipelines for MCP applications\n", "\n", "## 🏗️ Deployment Architectures\n", "\n", "We'll explore different deployment patterns:\n", "\n", "1. **🐳 Containerized Deployment** - Docker-based single-server deployment\n", "2. **☸️ Kubernetes Orchestration** - Scalable, resilient multi-server deployment\n", "3. **🌐 Serverless Functions** - Event-driven, auto-scaling deployment\n", "4. **🔄 Blue-Green Deployment** - Zero-downtime deployment strategy\n", "5. 
**📊 Multi-Region Setup** - Global, distributed MCP infrastructure\n", "\n", "## 🔍 Production Considerations\n", "\n", "- **Performance Monitoring** - Metrics, alerting, and observability\n", "- **Security Hardening** - Authentication, authorization, and encryption\n", "- **Disaster Recovery** - Backup strategies and failover mechanisms\n", "- **Cost Optimization** - Resource efficiency and budget management\n", "\n", "## 📚 Table of Contents\n", "\n", "1. [Production Readiness Checklist](#readiness-checklist)\n", "2. [Containerization with Docker](#docker-deployment)\n", "3. [Kubernetes Orchestration](#kubernetes-deployment)\n", "4. [Monitoring and Observability](#monitoring)\n", "5. [Security and Compliance](#security)\n", "6. [Scaling and Performance](#scaling)\n", "7. [CI/CD Integration](#cicd)\n", "8. [Disaster Recovery](#disaster-recovery)\n" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 2 }