Chapter 9 Production Deployment, Monitoring & Scaling

Production Architecture for MCP Servers

Moving from development to production requires a different mindset. This lesson covers the architectural patterns for reliable, scalable MCP server deployments.

Development vs Production

Aspect      Development        Production
Transport   stdio (local)      Streamable HTTP (remote)
Process     Single process     Containerized, replicated
Database    Local connection   Connection pooling
Errors      Console logging    Structured logging + alerting
Secrets     .env files         Secret manager (Vault, AWS SSM)
Scaling     Single instance    Load-balanced, auto-scaled
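
The Transport row does not have to mean two codebases. A minimal sketch, assuming a hypothetical `MCP_TRANSPORT` variable (the MCP SDK does not read it) and the common `NODE_ENV` convention, shows how one entry point could pick the transport at startup:

```typescript
type TransportKind = "stdio" | "http";

// MCP_TRANSPORT is a hypothetical variable name chosen for this sketch.
function selectTransport(env: Record<string, string | undefined>): TransportKind {
  const requested = env["MCP_TRANSPORT"];
  if (requested === "stdio" || requested === "http") {
    return requested; // An explicit choice always wins
  }
  if (requested !== undefined && requested !== "") {
    throw new Error(`MCP_TRANSPORT must be "stdio" or "http", got "${requested}"`);
  }
  // Assumed default: HTTP in production, stdio everywhere else.
  return env["NODE_ENV"] === "production" ? "http" : "stdio";
}

console.log(selectTransport({ NODE_ENV: "production" })); // prints "http"
console.log(selectTransport({})); // prints "stdio"
```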

Production Architecture

                    ┌───────────────┐
                    │ Load Balancer │
                    │  (nginx/ALB)  │
                    └───────┬───────┘
                            │
              ┌─────────────┼─────────────┐
              ▼             ▼             ▼
        ┌──────────┐ ┌──────────┐ ┌──────────┐
        │ MCP Pod 1│ │ MCP Pod 2│ │ MCP Pod 3│
        └─────┬────┘ └─────┬────┘ └─────┬────┘
              │             │             │
              └──────┬──────┴─────┬───────┘
                     ▼            ▼
              ┌──────────┐ ┌──────────┐
              │ Database │ │  Redis   │
              │   Pool   │ │  Cache   │
              └──────────┘ └──────────┘
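
To make the diagram concrete, the following sketch imitates what a round-robin load balancer does with three replicas. The `RoundRobin` class and the pod URLs are hypothetical, and real balancers such as nginx or an ALB add health checks and other policies on top. The core behavior is the same, though: consecutive requests can land on different pods, so no pod should rely on in-memory state left over from an earlier request.

```typescript
// Hypothetical round-robin selector illustrating the load balancer box above.
class RoundRobin<T> {
  private next = 0;
  private readonly targets: readonly T[];

  constructor(targets: readonly T[]) {
    if (targets.length === 0) {
      throw new Error("RoundRobin needs at least one target");
    }
    this.targets = targets;
  }

  // Return the next target, wrapping back to the first after the last.
  pick(): T {
    const target = this.targets[this.next];
    this.next = (this.next + 1) % this.targets.length;
    return target;
  }
}

// Assumed pod addresses, for illustration only.
const pods = new RoundRobin([
  "http://mcp-pod-1:3000",
  "http://mcp-pod-2:3000",
  "http://mcp-pod-3:3000",
]);

// Four consecutive requests: the fourth wraps around to pod 1.
const routed = [pods.pick(), pods.pick(), pods.pick(), pods.pick()];
console.log(routed);
```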

Converting to Streamable HTTP

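Remote production deployments typically serve MCP over Streamable HTTP instead of stdio. The example below runs in stateless mode: setting sessionIdGenerator to undefined disables session tracking, and each POST creates a fresh server and transport, so any replica behind the load balancer can handle any request: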

import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

const app = express();
app.use(express.json());

// Health check endpoint
app.get("/health", (req, res) => {
  res.json({ status: "healthy", version: "1.0.0", uptime: process.uptime() });
});

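// MCP endpoint (stateless: a fresh server and transport per request)
app.post("/mcp", async (req, res) => {
  try {
    const server = createMcpServer(); // Your server factory (returns a configured McpServer)
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: undefined, // undefined disables session tracking
    });
    // Release per-request resources once the response ends or the client disconnects
    res.on("close", () => {
      transport.close();
      server.close();
    });
    await server.connect(transport);
    // express.json() has already consumed the request stream, so pass the parsed body
    await transport.handleRequest(req, res, req.body);
  } catch (error) {
    console.error("MCP request failed:", error);
    // Only send an error if the transport has not already started the response
    if (!res.headersSent) {
      res.status(500).json({ error: "Internal server error" });
    }
  }
});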

// Graceful shutdown
const httpServer = app.listen(process.env.PORT || 3000, () => {
  console.log(`MCP server listening on port ${process.env.PORT || 3000}`);
});

process.on("SIGTERM", () => {
  console.log("SIGTERM received, shutting down gracefully...");
  httpServer.close(() => {
    console.log("Server closed");
    process.exit(0);
  });
});
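
Because the handler above runs in stateless mode, it only needs POST. The MCP Streamable HTTP specification lets a server that offers no standalone SSE stream answer GET with 405 Method Not Allowed, and permits the same for DELETE when there are no sessions to terminate. The sketch below builds JSON-RPC 2.0 error bodies for those cases. The helper names (`jsonRpcError`, `rejectUnsupportedMethod`) are hypothetical, and using code -32000 (the start of the range JSON-RPC reserves for implementation-defined server errors) is an assumption, not something this lesson prescribes.

```typescript
interface JsonRpcErrorResponse {
  jsonrpc: "2.0";
  error: { code: number; message: string };
  id: string | number | null;
}

// -32603 is "Internal error" in JSON-RPC 2.0; -32000 starts the implementation-defined range.
const INTERNAL_ERROR = -32603;
const SERVER_ERROR = -32000;

// Build a JSON-RPC 2.0 error envelope; id is null when the request id is unknown.
function jsonRpcError(
  code: number,
  message: string,
  id: string | number | null = null,
): JsonRpcErrorResponse {
  return { jsonrpc: "2.0", error: { code, message }, id };
}

interface HttpReply {
  status: number;
  headers: Record<string, string>;
  body: JsonRpcErrorResponse;
}

// Hypothetical helper: returns null for POST (the transport handles it),
// otherwise a 405 reply that a route handler can send as-is.
function rejectUnsupportedMethod(method: string): HttpReply | null {
  if (method.toUpperCase() === "POST") {
    return null;
  }
  return {
    status: 405,
    headers: { Allow: "POST" },
    body: jsonRpcError(SERVER_ERROR, "Method not allowed."),
  };
}

console.log(rejectUnsupportedMethod("GET"));
console.log(jsonRpcError(INTERNAL_ERROR, "Internal server error"));
```

In the Express app, GET and DELETE routes on `/mcp` could send such a reply with `res.status(reply.status).set(reply.headers).json(reply.body)`.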

Environment Configuration

// src/config.ts
interface ServerConfig {
  port: number;
  logLevel: "debug" | "info" | "warn" | "error";
  database: { url: string; maxConnections: number; idleTimeout: number };
  rateLimit: { maxPerMinute: number; maxPerHour: number };
  cors: { allowedOrigins: string[] };
}

function loadConfig(): ServerConfig {
  return {
    // Always pass radix 10 so values are parsed as decimal
    port: parseInt(process.env.PORT || "3000", 10),
    logLevel: (process.env.LOG_LEVEL || "info") as ServerConfig["logLevel"],
    database: {
      // Empty when DATABASE_URL is unset; validate before connecting
      url: process.env.DATABASE_URL || "",
      maxConnections: parseInt(process.env.DB_MAX_CONNECTIONS || "20", 10),
      idleTimeout: parseInt(process.env.DB_IDLE_TIMEOUT || "30000", 10), // milliseconds
    },
    rateLimit: {
      maxPerMinute: parseInt(process.env.RATE_LIMIT_PER_MINUTE || "60", 10),
      maxPerHour: parseInt(process.env.RATE_LIMIT_PER_HOUR || "1000", 10),
    },
    cors: {
      // Comma-separated list; empty entries are dropped
      allowedOrigins: (process.env.CORS_ORIGINS || "").split(",").filter(Boolean),
    },
  };
}
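
`parseInt` returns `NaN` for input it cannot parse, and the `as` cast on `logLevel` accepts any string, so a typo in an environment variable could go unnoticed until the server misbehaves. One way to fail fast at startup is sketched below. The helpers `readInt`, `readLogLevel`, and `requireString` are hypothetical, the database URL is a made-up value, and the environment is passed in as a plain object (in the real server that would be `process.env`) so the example runs on its own.

```typescript
type Env = Record<string, string | undefined>;
type LogLevel = "debug" | "info" | "warn" | "error";

const LOG_LEVELS: readonly LogLevel[] = ["debug", "info", "warn", "error"];

// Read a non-negative integer, falling back when the variable is unset or empty.
function readInt(env: Env, name: string, fallback: number): number {
  const raw = env[name]?.trim();
  if (raw === undefined || raw === "") {
    return fallback;
  }
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

// Accept only the four known levels instead of casting blindly.
function readLogLevel(env: Env): LogLevel {
  const raw = env["LOG_LEVEL"] ?? "info";
  const match = LOG_LEVELS.find((level) => level === raw);
  if (match === undefined) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(", ")}, got "${raw}"`);
  }
  return match;
}

// Fail at startup when a required variable is missing.
function requireString(env: Env, name: string): string {
  const raw = env[name];
  if (raw === undefined || raw === "") {
    throw new Error(`${name} is required but not set`);
  }
  return raw;
}

// Assumed sample environment, for illustration only.
const sampleEnv: Env = {
  PORT: "8080",
  LOG_LEVEL: "warn",
  DATABASE_URL: "postgres://db.internal/app",
};

console.log({
  port: readInt(sampleEnv, "PORT", 3000),
  logLevel: readLogLevel(sampleEnv),
  databaseUrl: requireString(sampleEnv, "DATABASE_URL"),
});
```

Calling these helpers once at startup turns a bad value into an immediate, descriptive crash rather than a subtle runtime fault.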

Structured Logging

import pino from "pino";

const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Log one structured entry per MCP tool call (duration in milliseconds)
function logMcpRequest(toolName: string, duration: number, status: string) {
  logger.info({
    event: "mcp_tool_call",
    tool: toolName,
    duration_ms: duration,
    status,
  });
}
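
`logMcpRequest` expects a duration and a status, which the caller has to measure. A small wrapper can do that around any tool handler. In the sketch below, `withToolLogging` is a hypothetical helper, the status strings "ok" and "error" are assumed values, and the logging function is injected as a parameter (pass `logMcpRequest` in the real server) so the example does not depend on pino.

```typescript
type LogFn = (toolName: string, durationMs: number, status: string) => void;

// Run a tool handler, then report its duration and outcome through `log`.
async function withToolLogging<T>(
  toolName: string,
  log: LogFn,
  handler: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  try {
    const result = await handler();
    log(toolName, Date.now() - start, "ok");
    return result;
  } catch (error) {
    log(toolName, Date.now() - start, "error");
    throw error; // Re-throw so the caller still sees the failure
  }
}

// Stand-in logger that prints the same fields logMcpRequest would record.
const printLog: LogFn = (tool, durationMs, status) => {
  console.log(JSON.stringify({ event: "mcp_tool_call", tool, duration_ms: durationMs, status }));
};

withToolLogging("echo", printLog, async () => "hello").then((value) => {
  console.log(value);
});
```

Logging in both the success and failure paths means failed tool calls show up in the same stream, with the same fields, as successful ones.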

Key Takeaway

Production MCP servers need HTTP transport, structured logging, environment-based configuration, health checks, and graceful shutdown. These patterns transform your development server into a reliable production service. Start with this foundation before adding Docker and Kubernetes.

Engr Mejba Ahmed
