Production Architecture for MCP Servers
Moving from development to production requires a different mindset. This lesson covers the architectural patterns for reliable, scalable MCP server deployments.
Development vs Production
| Aspect | Development | Production |
|---|---|---|
| Transport | stdio (local) | Streamable HTTP (remote) |
| Process | Single process | Containerized, replicated |
| Database | Local connection | Connection pooling |
| Errors | Console logging | Structured logging + alerting |
| Secrets | .env files | Secret manager (Vault, AWS SSM) |
| Scaling | Single instance | Load balanced, auto-scaled |
Production Architecture
            ┌───────────────┐
            │ Load Balancer │
            │  (nginx/ALB)  │
            └───────┬───────┘
                    │
      ┌─────────────┼─────────────┐
      ▼             ▼             ▼
┌──────────┐  ┌──────────┐  ┌──────────┐
│ MCP Pod 1│  │ MCP Pod 2│  │ MCP Pod 3│
└─────┬────┘  └─────┬────┘  └─────┬────┘
      │             │             │
      └──────┬──────┴──────┬──────┘
             ▼             ▼
       ┌──────────┐  ┌──────────┐
       │ Database │  │  Redis   │
       │   Pool   │  │  Cache   │
       └──────────┘  └──────────┘
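One consequence of this layout is that consecutive requests from the same client may land on different pods, so anything kept in one pod's memory is invisible to its siblings. The sketch below simulates that in plain TypeScript, assuming a round-robin balancer. `SharedStore`, `Pod`, and `RoundRobinBalancer` are hypothetical stand-ins (an in-memory map plays the role of Redis) and are not part of any MCP SDK.

```typescript
// Hypothetical in-memory stand-in for a shared store such as Redis.
class SharedStore {
  private data = new Map<string, number>();

  increment(key: string): number {
    const next = (this.data.get(key) ?? 0) + 1;
    this.data.set(key, next);
    return next;
  }
}

// Hypothetical pod: keeps one counter in local memory and one in the shared store.
class Pod {
  private localCount = 0;
  private store: SharedStore;

  constructor(store: SharedStore) {
    this.store = store;
  }

  // State that lives only inside this pod's process.
  handleLocal(): number {
    this.localCount += 1;
    return this.localCount;
  }

  // State that every pod can see.
  handleShared(key: string): number {
    return this.store.increment(key);
  }
}

// Assumed balancing policy: plain round-robin over the pod list.
class RoundRobinBalancer {
  private next = 0;
  private pods: Pod[];

  constructor(pods: Pod[]) {
    this.pods = pods;
  }

  pick(): Pod {
    const pod = this.pods[this.next];
    this.next = (this.next + 1) % this.pods.length;
    return pod;
  }
}

const store = new SharedStore();
const balancer = new RoundRobinBalancer([new Pod(store), new Pod(store), new Pod(store)]);

// Three requests counted in pod memory: each pod only ever sees one of them.
const localCounts = [0, 1, 2].map(() => balancer.pick().handleLocal());

// Three requests counted in the shared store: the count is consistent across pods.
const sharedCounts = [0, 1, 2].map(() => balancer.pick().handleShared("calls"));
```

Here `localCounts` ends up as `[1, 1, 1]` while `sharedCounts` is `[1, 2, 3]`, which is why counters, caches, and session data belong in the database or Redis rather than in the pod.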
Converting to Streamable HTTP
Production MCP servers use HTTP instead of stdio:
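import express from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

// Placeholder server factory: register your tools, resources, and prompts here.
// The name and version below are illustrative values.
function createMcpServer(): McpServer {
  return new McpServer({ name: "example-server", version: "1.0.0" });
}

const app = express();
app.use(express.json());
const port = Number(process.env.PORT) || 3000;

// Health check endpoint
app.get("/health", (_req, res) => {
  res.json({ status: "healthy", version: "1.0.0", uptime: process.uptime() });
});

// MCP endpoint (stateless: a fresh server and transport for each request)
app.post("/mcp", async (req, res) => {
  try {
    const server = createMcpServer();
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: undefined, // undefined disables session tracking
    });
    // Release per-request resources once the response finishes or the client disconnects
    res.on("close", () => {
      transport.close();
      server.close();
    });
    await server.connect(transport);
    // express.json() has already consumed the request stream, so pass the parsed body
    await transport.handleRequest(req, res, req.body);
  } catch (error) {
    console.error("MCP request failed:", error);
    // Only send a 500 if the transport has not already started the response
    if (!res.headersSent) {
      res.status(500).json({ error: "Internal server error" });
    }
  }
});

// Graceful shutdown
const httpServer = app.listen(port, () => {
  console.log(`MCP server listening on port ${port}`);
});

process.on("SIGTERM", () => {
  console.log("SIGTERM received, shutting down gracefully...");
  // Stop accepting new connections, then exit once open requests have finished
  httpServer.close(() => {
    console.log("Server closed");
    process.exit(0);
  });
});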
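Graceful shutdown and health checks work best together: once SIGTERM arrives, the health endpoint can start failing so the load balancer stops routing new traffic while in-flight requests finish. The following is a minimal sketch of that bookkeeping, not a definitive implementation. `DrainTracker` and its method names are hypothetical, and how it is wired into the Express handlers is left as an assumption.

```typescript
// Hypothetical helper that tracks in-flight requests during shutdown.
class DrainTracker {
  private inFlight = 0;
  private draining = false;

  // Call at the start of a request. Returns false once draining has begun,
  // in which case the caller should reject the request (for example with 503).
  tryBegin(): boolean {
    if (this.draining) return false;
    this.inFlight += 1;
    return true;
  }

  // Call when a request finishes, whether it succeeded or failed.
  end(): void {
    if (this.inFlight > 0) this.inFlight -= 1;
  }

  // Call from the SIGTERM handler.
  startDraining(): void {
    this.draining = true;
  }

  // Status code for the health endpoint: 503 tells the load balancer to stop sending traffic.
  healthStatus(): 200 | 503 {
    return this.draining ? 503 : 200;
  }

  // True once draining has started and every tracked request has completed.
  isIdle(): boolean {
    return this.draining && this.inFlight === 0;
  }
}

// Example sequence: one request is in flight when shutdown begins.
const tracker = new DrainTracker();
const accepted = tracker.tryBegin();          // request admitted before shutdown
tracker.startDraining();                      // SIGTERM arrives
const statusWhileDraining = tracker.healthStatus();
const acceptedAfterSignal = tracker.tryBegin(); // new work is refused
const idleBeforeEnd = tracker.isIdle();       // the first request is still running
tracker.end();                                // the first request completes
const idleAfterEnd = tracker.isIdle();        // now safe to exit
```

In a real server the SIGTERM handler would typically call `startDraining()`, poll `isIdle()` or rely on `httpServer.close()`, and force an exit after a deadline shorter than the orchestrator's termination grace period.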
Environment Configuration
// src/config.ts
interface ServerConfig {
  port: number;
  logLevel: "debug" | "info" | "warn" | "error";
  database: { url: string; maxConnections: number; idleTimeout: number }; // idleTimeout in ms
  rateLimit: { maxPerMinute: number; maxPerHour: number };
  cors: { allowedOrigins: string[] };
}

// Reads configuration from environment variables, falling back to defaults.
function loadConfig(): ServerConfig {
  return {
    port: parseInt(process.env.PORT || "3000", 10),
    logLevel: (process.env.LOG_LEVEL || "info") as ServerConfig["logLevel"],
    database: {
      // An empty string means DATABASE_URL was not set; check it before connecting
      url: process.env.DATABASE_URL || "",
      maxConnections: parseInt(process.env.DB_MAX_CONNECTIONS || "20", 10),
      idleTimeout: parseInt(process.env.DB_IDLE_TIMEOUT || "30000", 10),
    },
    rateLimit: {
      maxPerMinute: parseInt(process.env.RATE_LIMIT_PER_MINUTE || "60", 10),
      maxPerHour: parseInt(process.env.RATE_LIMIT_PER_HOUR || "1000", 10),
    },
    cors: {
      // Comma-separated list; empty entries are dropped
      allowedOrigins: (process.env.CORS_ORIGINS || "").split(",").filter(Boolean),
    },
  };
}
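`loadConfig` accepts whatever the environment contains: `parseInt` returns `NaN` for a non-numeric value, a missing `DATABASE_URL` becomes an empty string, and the `LOG_LEVEL` cast is unchecked. One way to fail fast at startup is sketched below. The helper names `requireEnv`, `intFromEnv`, and `logLevelFromEnv` are hypothetical, and the environment is passed in as a plain object so the helpers can be exercised without touching `process.env`.

```typescript
// Plain-object view of the environment (process.env fits this shape).
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the variable or throws if it is missing or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Hypothetical helper: parses a positive integer, using the fallback only when unset.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

// Hypothetical helper: rejects log levels outside the allowed set.
function logLevelFromEnv(env: Env, fallback: LogLevel = "info"): LogLevel {
  const raw = env["LOG_LEVEL"] ?? fallback;
  const match = LOG_LEVELS.find((level) => level === raw);
  if (match === undefined) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(", ")}, got "${raw}"`);
  }
  return match;
}
```

Used inside `loadConfig`, these would replace the bare `parseInt` calls, for example `port: intFromEnv(process.env, "PORT", 3000)` and `url: requireEnv(process.env, "DATABASE_URL")`, so a misconfigured pod crashes at boot instead of failing on its first request.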
Structured Logging
import pino from "pino";

const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
});

// Log every MCP request
function logMcpRequest(toolName: string, duration: number, status: string) {
  logger.info({
    event: "mcp_tool_call",
    tool: toolName,
    duration_ms: duration,
    status,
  });
}
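`logMcpRequest` still needs a caller that measures the duration and decides the status. A minimal sketch of such a wrapper follows, under the assumption that tool handlers are async functions. `toolLogEntry` and `withToolLogging` are hypothetical names, and the log sink is injected as a callback so the example does not depend on pino.

```typescript
// Shape of one structured log record, mirroring the fields used above.
type ToolLogEntry = {
  event: "mcp_tool_call";
  tool: string;
  duration_ms: number;
  status: "ok" | "error";
  error?: string;
};

// Hypothetical pure helper: builds the record from timestamps and an optional error.
function toolLogEntry(
  tool: string,
  startedAtMs: number,
  finishedAtMs: number,
  error?: unknown,
): ToolLogEntry {
  const entry: ToolLogEntry = {
    event: "mcp_tool_call",
    tool,
    duration_ms: Math.max(0, finishedAtMs - startedAtMs),
    status: error === undefined ? "ok" : "error",
  };
  if (error !== undefined) {
    entry.error = error instanceof Error ? error.message : String(error);
  }
  return entry;
}

// Hypothetical wrapper: times an async tool handler and logs exactly one entry per call.
async function withToolLogging<T>(
  tool: string,
  handler: () => Promise<T>,
  log: (entry: ToolLogEntry) => void,
): Promise<T> {
  const startedAt = Date.now();
  try {
    const result = await handler();
    log(toolLogEntry(tool, startedAt, Date.now()));
    return result;
  } catch (err) {
    // Log the failure, then rethrow so the caller still sees the error.
    log(toolLogEntry(tool, startedAt, Date.now(), err ?? "unknown error"));
    throw err;
  }
}

// Example records for a successful and a failed call.
const okEntry = toolLogEntry("search", 1000, 1250);
const failedEntry = toolLogEntry("search", 1000, 1100, new Error("boom"));
```

With pino, the sink could be `(entry) => logger.info(entry)`. Keeping the record builder pure makes the log format easy to unit test independently of the logger.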
Key Takeaway
Production MCP servers need HTTP transport, structured logging, environment-based configuration, health checks, and graceful shutdown. These patterns transform your development server into a reliable production service. Start with this foundation before adding Docker and Kubernetes.