Chapter 9 Production Deployment, Monitoring & Scaling

Production Architecture for MCP Servers

20 min read · Lesson 33 / 40

Moving from development to production requires a different mindset. This lesson covers the architectural patterns for reliable, scalable MCP server deployments.

Development vs Production

Aspect    | Development      | Production
----------|------------------|--------------------------------
Transport | stdio (local)    | Streamable HTTP (remote)
Process   | Single process   | Containerized, replicated
Database  | Local connection | Connection pooling
Errors    | Console logging  | Structured logging + alerting
Secrets   | .env files       | Secret manager (Vault, AWS SSM)
Scaling   | Single instance  | Load balanced, auto-scaled

Production Architecture

                    ┌────────────────┐
                    │ Load Balancer  │
                    │  (nginx/ALB)   │
                    └───────┬────────┘
                            │
              ┌─────────────┼─────────────┐
              ▼             ▼             ▼
         ┌──────────┐  ┌──────────┐  ┌──────────┐
         │ MCP Pod 1│  │ MCP Pod 2│  │ MCP Pod 3│
         └────┬─────┘  └────┬─────┘  └────┬─────┘
              │             │             │
              └──────┬──────┴─────┬───────┘
                     ▼            ▼
                ┌──────────┐ ┌──────────┐
                │ Database │ │  Redis   │
                │   Pool   │ │  Cache   │
                └──────────┘ └──────────┘
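
In the diagram, every pod talks to the same database pool and Redis instance. One consequence is that state a single process would normally keep in memory, such as rate-limit counters, has to live in that shared layer, because the load balancer may send consecutive requests from one client to different pods. The following is a minimal sketch of that idea, not the course's implementation: `CounterStore`, `MemoryCounterStore`, and `allowRequest` are hypothetical names, and the in-memory store only stands in for a Redis-backed one.

```typescript
// Hypothetical interface: any store that can increment a counter and expire it.
// In production this would wrap a shared store such as Redis (assumption).
interface CounterStore {
  increment(key: string, ttlMs: number): Promise<number>;
}

// In-memory stand-in for local testing. Each pod would hold its own copy,
// which is exactly why it is unsuitable behind a load balancer.
class MemoryCounterStore implements CounterStore {
  private counters = new Map<string, { count: number; expiresAt: number }>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  async increment(key: string, ttlMs: number): Promise<number> {
    const t = this.now();
    const entry = this.counters.get(key);
    if (!entry || entry.expiresAt <= t) {
      // First hit in this window, or the previous window has expired
      this.counters.set(key, { count: 1, expiresAt: t + ttlMs });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }
}

// Fixed-window limiter: at most maxPerMinute calls per client per clock minute.
async function allowRequest(
  store: CounterStore,
  clientId: string,
  maxPerMinute: number,
  now: () => number = Date.now,
): Promise<boolean> {
  const windowMs = 60000;
  const windowIndex = Math.floor(now() / windowMs);
  const count = await store.increment(`rl:${clientId}:${windowIndex}`, windowMs);
  return count <= maxPerMinute;
}
```

Because the limiter depends only on the `CounterStore` interface, swapping the in-memory store for a shared one does not change the calling code.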

Converting to Streamable HTTP

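Remote production deployments use the Streamable HTTP transport instead of stdio, so clients reach the server over the network rather than spawning it as a local process. The example below runs in stateless mode (`sessionIdGenerator: undefined`): each POST gets a fresh server and transport, so any replica can handle any request.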

import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

const app = express();
app.use(express.json());

// Health check endpoint
app.get("/health", (req, res) => {
  res.json({ status: "healthy", version: "1.0.0", uptime: process.uptime() });
});

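// MCP endpoint (stateless: a fresh server and transport per request)
app.post("/mcp", async (req, res) => {
  try {
    const server = createMcpServer(); // Your server factory
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: undefined, // undefined disables session tracking
    });
    // Release per-request resources once the response ends
    res.on("close", () => {
      transport.close();
      server.close();
    });
    await server.connect(transport);
    // express.json() has already consumed the body stream, so pass the parsed body
    await transport.handleRequest(req, res, req.body);
  } catch (error) {
    console.error("MCP request failed:", error);
    // Only send an error if the transport has not already started responding
    if (!res.headersSent) {
      res.status(500).json({
        jsonrpc: "2.0",
        error: { code: -32603, message: "Internal server error" },
        id: null,
      });
    }
  }
});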

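const port = Number(process.env.PORT) || 3000;
const httpServer = app.listen(port, () => {
  console.log(`MCP server listening on port ${port}`);
});

// Graceful shutdown: stop accepting new connections and let in-flight requests finish
process.on("SIGTERM", () => {
  console.log("SIGTERM received, shutting down gracefully...");
  httpServer.close(() => {
    console.log("Server closed");
    process.exit(0);
  });
  // Slow or stuck connections can delay close(); force exit after 10 seconds
  setTimeout(() => process.exit(1), 10000).unref();
});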
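
Behind a load balancer, the health endpoint usually also decides whether a pod receives traffic, so it helps if it reports "not ready" as soon as shutdown begins. The sketch below is one way to model that, under the assumption that the load balancer treats a non-200 response as unhealthy. `Lifecycle` and its method names are hypothetical and not part of Express or the MCP SDK.

```typescript
type LifecycleState = "starting" | "ready" | "draining";

interface HealthResponse {
  httpStatus: number;
  body: { status: LifecycleState; uptimeSeconds: number };
}

// Tracks whether this process should currently receive traffic.
class Lifecycle {
  private state: LifecycleState = "starting";
  private readonly startedAt: number;
  private readonly now: () => number;

  constructor(startedAt: number = Date.now(), now: () => number = Date.now) {
    this.startedAt = startedAt;
    this.now = now;
  }

  // Call once the server is listening and its dependencies are reachable.
  markReady(): void {
    if (this.state === "starting") this.state = "ready";
  }

  // Call first thing in the SIGTERM handler, before closing the HTTP server.
  beginDraining(): void {
    this.state = "draining";
  }

  // 200 only while ready; 503 signals the load balancer to route elsewhere.
  health(): HealthResponse {
    return {
      httpStatus: this.state === "ready" ? 200 : 503,
      body: {
        status: this.state,
        uptimeSeconds: Math.floor((this.now() - this.startedAt) / 1000),
      },
    };
  }
}
```

In the Express handler this would look like `const h = lifecycle.health(); res.status(h.httpStatus).json(h.body);`, with `lifecycle.markReady()` in the `listen` callback and `lifecycle.beginDraining()` at the top of the SIGTERM handler.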

Environment Configuration

// src/config.ts
export interface ServerConfig {
  port: number;
  logLevel: "debug" | "info" | "warn" | "error";
  database: { url: string; maxConnections: number; idleTimeout: number };
  rateLimit: { maxPerMinute: number; maxPerHour: number };
  cors: { allowedOrigins: string[] };
}

// Build the config from environment variables, falling back to defaults when unset
export function loadConfig(): ServerConfig {
  return {
    port: parseInt(process.env.PORT || "3000", 10),
    logLevel: (process.env.LOG_LEVEL || "info") as ServerConfig["logLevel"],
    database: {
      url: process.env.DATABASE_URL || "", // empty string when unset
      maxConnections: parseInt(process.env.DB_MAX_CONNECTIONS || "20", 10),
      idleTimeout: parseInt(process.env.DB_IDLE_TIMEOUT || "30000", 10), // milliseconds
    },
    rateLimit: {
      maxPerMinute: parseInt(process.env.RATE_LIMIT_PER_MINUTE || "60", 10),
      maxPerHour: parseInt(process.env.RATE_LIMIT_PER_HOUR || "1000", 10),
    },
    cors: {
      // Comma-separated list; empty entries are dropped
      allowedOrigins: (process.env.CORS_ORIGINS || "").split(",").filter(Boolean),
    },
  };
}
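
One gap in `loadConfig` is that malformed values pass silently: `parseInt` returns `NaN` for `PORT=abc`, the `as` cast accepts any string as a log level, and a missing `DATABASE_URL` becomes an empty string. A possible way to fail fast at startup is sketched below. The helper names (`intFromEnv`, `requiredFromEnv`, `logLevelFromEnv`) are hypothetical, and the environment is passed in as a plain object so the helpers can be tested without touching `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Read a non-negative integer, using the fallback only when the variable is unset.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  // Stricter than parseInt alone, which would accept "30s" as 30
  if (!/^\d+$/.test(raw)) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return parseInt(raw, 10);
}

// Read a variable that has no sensible default.
function requiredFromEnv(env: Env, name: string): string {
  const raw = env[name];
  if (!raw) throw new Error(`${name} is required but not set`);
  return raw;
}

const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

// Validate the log level instead of casting an arbitrary string.
function logLevelFromEnv(env: Env, fallback: LogLevel = "info"): LogLevel {
  const raw = env["LOG_LEVEL"] ?? fallback;
  const match = LOG_LEVELS.find((level) => level === raw);
  if (!match) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(", ")}, got "${raw}"`);
  }
  return match;
}
```

Inside `loadConfig` these would be called as, for example, `intFromEnv(process.env, "PORT", 3000)` and `requiredFromEnv(process.env, "DATABASE_URL")`, so a misconfigured pod crashes on boot rather than on its first database query.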

Structured Logging

import pino from "pino";

const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Log every MCP request
function logMcpRequest(toolName: string, duration: number, status: string) {
  logger.info({
    event: "mcp_tool_call",
    tool: toolName,
    duration_ms: duration,
    status,
  });
}
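
`logMcpRequest` expects a duration and a status, so something has to measure them around each tool handler. A minimal sketch of such a wrapper follows. `withToolLogging` is a hypothetical helper, not part of pino or the MCP SDK, and it takes the logging function and the clock as parameters so it can be exercised without a real logger.

```typescript
type ToolLogger = (toolName: string, durationMs: number, status: string) => void;

// Runs a tool handler, then reports how long it took and whether it succeeded.
async function withToolLogging<T>(
  toolName: string,
  handler: () => Promise<T>,
  log: ToolLogger,
  now: () => number = Date.now,
): Promise<T> {
  const start = now();
  try {
    const result = await handler();
    log(toolName, now() - start, "success");
    return result;
  } catch (error) {
    log(toolName, now() - start, "error");
    throw error; // rethrow so the caller still sees the failure
  }
}
```

A tool handler could then be wrapped as `withToolLogging("search", () => runSearch(args), logMcpRequest)`, where `runSearch` stands in for the tool's own logic.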

Key Takeaway

Production MCP servers need HTTP transport, structured logging, environment-based configuration, health checks, and graceful shutdown. These patterns transform your development server into a reliable production service. Start with this foundation before adding Docker and Kubernetes.