Production Architecture for MCP Servers


Moving from development to production requires a different mindset. This lesson covers the architectural patterns for reliable, scalable MCP server deployments.

Development vs Production

Aspect	Development	Production
Transport	stdio (local)	Streamable HTTP (remote)
Process	Single process	Containerized, replicated
Database	Local connection	Connection pooling
Errors	Console logging	Structured logging + alerting
Secrets	.env files	Secret manager (Vault, AWS SSM)
Scaling	Single instance	Load balanced, auto-scaled

Production Architecture

                    ┌────────────────┐
                    │  Load Balancer │
                    │  (nginx/ALB)   │
                    └───────┬────────┘
                            │
              ┌─────────────┼─────────────┐
              ▼             ▼             ▼
        ┌──────────┐ ┌──────────┐ ┌──────────┐
        │ MCP Pod 1│ │ MCP Pod 2│ │ MCP Pod 3│
        └─────┬────┘ └─────┬────┘ └─────┬────┘
              │             │             │
              └──────┬──────┴─────┬───────┘
                     ▼            ▼
              ┌──────────┐ ┌──────────┐
              │ Database │ │  Redis   │
              │   Pool   │ │  Cache   │
              └──────────┘ └──────────┘
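
Because the load balancer may send any request to any pod, data that must outlive a single request belongs in the shared database or cache rather than in pod memory. A minimal sketch of the cache-aside pattern this implies is shown here. The `CacheStore` interface, the `InMemoryCache` stand-in, and the `getOrLoad` helper are hypothetical names for illustration, and the code is synchronous for brevity, whereas a real Redis client would be asynchronous.

```typescript
// Hypothetical cache interface; in production a Redis client would sit behind it.
interface CacheStore {
  get(key: string): string | undefined;
  set(key: string, value: string, ttlMs: number): void;
}

// In-memory stand-in for Redis, for illustration only.
class InMemoryCache implements CacheStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get(key: string): string | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: string, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}

// Cache-aside: read from the cache first, fall back to the loader on a miss.
function getOrLoad(cache: CacheStore, key: string, ttlMs: number, load: () => string): string {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = load();
  cache.set(key, fresh, ttlMs);
  return fresh;
}
```

With a shared store behind `CacheStore`, a value loaded by one pod is visible to the others until its TTL expires, which is what keeps the pods interchangeable.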

Converting to Streamable HTTP

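Production MCP servers that serve remote clients use the Streamable HTTP transport instead of stdio. The example below runs in stateless mode (no session IDs), so any replica behind the load balancer can handle any request: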

import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

const app = express();
app.use(express.json());

// Health check endpoint
app.get("/health", (req, res) => {
  res.json({ status: "healthy", version: "1.0.0", uptime: process.uptime() });
});

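// MCP endpoint (stateless: a fresh server and transport per request)
app.post("/mcp", async (req, res) => {
  try {
    const server = createMcpServer(); // Your server factory, returns a configured McpServer
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: undefined, // undefined disables session tracking
    });
    // Release per-request resources once the response is finished
    res.on("close", () => {
      transport.close();
      server.close();
    });
    await server.connect(transport);
    // express.json() has already consumed the body stream, so pass the parsed body
    await transport.handleRequest(req, res, req.body);
  } catch (error) {
    console.error("MCP request failed:", error);
    // Only send an error if the transport has not started responding
    if (!res.headersSent) {
      res.status(500).json({
        jsonrpc: "2.0",
        error: { code: -32603, message: "Internal server error" },
        id: null,
      });
    }
  }
});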

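// Start the HTTP server
const port = Number(process.env.PORT) || 3000;
const httpServer = app.listen(port, () => {
  console.log(`MCP server listening on port ${port}`);
});

// Graceful shutdown: stop accepting connections, then exit once in-flight requests finish
process.on("SIGTERM", () => {
  console.log("SIGTERM received, shutting down gracefully...");
  httpServer.close(() => {
    console.log("Server closed");
    process.exit(0);
  });
  // Open keep-alive connections can delay close(), so force an exit after a deadline
  // (10 seconds is an arbitrary example value)
  setTimeout(() => process.exit(1), 10_000).unref();
});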
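
`httpServer.close()` stops new connections, but a load balancer usually also needs a signal to stop routing traffic to the pod before the process exits. One possible approach, sketched here under assumptions, is to count in-flight requests and expose a draining flag that a readiness endpoint could report. `DrainTracker` and its method names are hypothetical and are not part of Express or the MCP SDK.

```typescript
// Hypothetical helper: counts in-flight requests and signals when draining completes.
class DrainTracker {
  private inFlight = 0;
  private draining = false;
  private idleCallbacks: Array<() => void> = [];

  // Call when a request arrives. Returns a "done" function, or null while draining.
  begin(): (() => void) | null {
    if (this.draining) return null; // the caller should answer 503
    this.inFlight += 1;
    let finished = false;
    return () => {
      if (finished) return; // guard against double completion
      finished = true;
      this.inFlight -= 1;
      this.notifyIfIdle();
    };
  }

  // Call on SIGTERM: refuse new work and run the callback once nothing is in flight.
  startDraining(onIdle: () => void): void {
    this.draining = true;
    this.idleCallbacks.push(onIdle);
    this.notifyIfIdle();
  }

  // A readiness probe can report "not ready" while draining.
  isReady(): boolean {
    return !this.draining;
  }

  private notifyIfIdle(): void {
    if (!this.draining || this.inFlight > 0) return;
    const callbacks = this.idleCallbacks;
    this.idleCallbacks = [];
    for (const cb of callbacks) cb();
  }
}
```

In an Express app, a middleware could call `begin()` for each request and invoke the returned function on the response's `close` event, while the SIGTERM handler calls `startDraining` with the code that closes the HTTP server.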

Environment Configuration

// src/config.ts
export interface ServerConfig {
  port: number;
  logLevel: "debug" | "info" | "warn" | "error";
  database: { url: string; maxConnections: number; idleTimeout: number };
  rateLimit: { maxPerMinute: number; maxPerHour: number };
  cors: { allowedOrigins: string[] };
}

export function loadConfig(): ServerConfig {
  // Explicit radix 10 keeps parsing unambiguous. Note that parseInt yields NaN
  // for non-numeric input, so validate these values before relying on them.
  return {
    port: parseInt(process.env.PORT || "3000", 10),
    logLevel: (process.env.LOG_LEVEL || "info") as ServerConfig["logLevel"],
    database: {
      url: process.env.DATABASE_URL || "",
      maxConnections: parseInt(process.env.DB_MAX_CONNECTIONS || "20", 10),
      idleTimeout: parseInt(process.env.DB_IDLE_TIMEOUT || "30000", 10),
    },
    rateLimit: {
      maxPerMinute: parseInt(process.env.RATE_LIMIT_PER_MINUTE || "60", 10),
      maxPerHour: parseInt(process.env.RATE_LIMIT_PER_HOUR || "1000", 10),
    },
    cors: {
      // Comma-separated list; empty entries are dropped
      allowedOrigins: (process.env.CORS_ORIGINS || "").split(",").filter(Boolean),
    },
  };
}
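
The loader above accepts whatever the environment provides: a malformed number becomes `NaN`, an unknown log level passes the type cast, and a missing `DATABASE_URL` becomes an empty string. A possible hardening is to fail fast at startup. The sketch below covers a subset of the fields, uses hypothetical helper names (`readInt`, `requireString`, `readLogLevel`, `loadValidatedConfig`), and takes the environment as a plain object so it can be exercised without touching `process.env`.

```typescript
type Env = Record<string, string | undefined>;
type LogLevel = "debug" | "info" | "warn" | "error";
const LOG_LEVELS: readonly LogLevel[] = ["debug", "info", "warn", "error"];

// Parse a positive integer, falling back to a default when the variable is unset.
function readInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Require a non-empty string instead of silently defaulting to "".
function requireString(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`${name} is required`);
  return value;
}

// Accept only the known log levels, defaulting to "info" when unset.
function readLogLevel(env: Env): LogLevel {
  const raw = env["LOG_LEVEL"] || "info";
  const match = LOG_LEVELS.find((level) => level === raw);
  if (!match) throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(", ")}`);
  return match;
}

function loadValidatedConfig(env: Env) {
  return {
    port: readInt(env, "PORT", 3000),
    logLevel: readLogLevel(env),
    database: {
      url: requireString(env, "DATABASE_URL"),
      maxConnections: readInt(env, "DB_MAX_CONNECTIONS", 20),
      idleTimeout: readInt(env, "DB_IDLE_TIMEOUT", 30000),
    },
  };
}
```

At startup this could be called as `loadValidatedConfig(process.env)`, so a misconfigured pod crashes immediately instead of failing on its first database query.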

Structured Logging

import pino from "pino";

const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Emit one structured log entry per MCP tool call
function logMcpRequest(toolName: string, duration: number, status: string) {
  logger.info({
    event: "mcp_tool_call",
    tool: toolName,
    duration_ms: duration,
    status,
  });
}
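
`logMcpRequest` still needs a caller that measures the duration and decides the status. One way this could be wired up, as a sketch, is a wrapper around each tool handler. `ToolCallRecord`, `buildToolCallRecord`, and `withToolLogging` are hypothetical names, and the `log` parameter is any function that accepts the record, for example `(r) => logger.info(r)` with the pino logger above.

```typescript
interface ToolCallRecord {
  event: "mcp_tool_call";
  tool: string;
  duration_ms: number;
  status: "ok" | "error";
  error?: string;
}

// Build the structured record from start and end timestamps (milliseconds).
function buildToolCallRecord(tool: string, startedAt: number, endedAt: number, error?: unknown): ToolCallRecord {
  const record: ToolCallRecord = {
    event: "mcp_tool_call",
    tool,
    duration_ms: endedAt - startedAt,
    status: error === undefined ? "ok" : "error",
  };
  if (error !== undefined) {
    record.error = error instanceof Error ? error.message : String(error);
  }
  return record;
}

// Wrap an async tool handler so each call emits a record on success or failure.
function withToolLogging<A, R>(
  tool: string,
  handler: (args: A) => Promise<R>,
  log: (record: ToolCallRecord) => void,
  now: () => number = Date.now,
): (args: A) => Promise<R> {
  return async (args: A) => {
    const startedAt = now();
    let result: R;
    try {
      result = await handler(args);
    } catch (error) {
      // Substitute a message if the handler threw undefined, so status is still "error"
      log(buildToolCallRecord(tool, startedAt, now(), error ?? "unknown error"));
      throw error; // rethrow so the MCP layer still sees the failure
    }
    log(buildToolCallRecord(tool, startedAt, now()));
    return result;
  };
}
```

Keeping the record builder separate from the wrapper makes the log shape easy to unit test, and the injectable `now` and `log` parameters avoid a hard dependency on the clock or on a particular logger.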

Key Takeaway

Production MCP servers need HTTP transport, structured logging, environment-based configuration, health checks, and graceful shutdown. These patterns transform your development server into a reliable production service. Start with this foundation before adding Docker and Kubernetes.
