Graceful Shutdown in Node.js: Stop Dropping Requests on Every Deploy

How to drain connections, close workers, and signal your load balancer before the process exits — on every single deploy

ZyVOP

Senior Developer

May 25, 2026

Here is what happens when a Node.js app shuts down the wrong way:

Your CI pipeline pushes a new deploy. Docker sends SIGTERM to the container. The process exits immediately. Any request in flight at that moment gets dropped — no response, no error, just a hanging connection from the client's perspective. If you are deploying five times a day, users notice.

Graceful shutdown is the difference between a deploy that users never feel and one that generates a spike of errors in your monitoring. It is also one of the more neglected production patterns in Node.js — not because it is hard, but because it only fails in production under real traffic, which means it often goes unfixed until it causes a real incident.


What Graceful Shutdown Actually Means

When your process receives a termination signal (SIGTERM from Docker/Kubernetes, SIGINT from Ctrl+C), it should:

  1. Stop accepting new connections

  2. Finish handling all in-flight requests

  3. Drain background job workers

  4. Close database connections cleanly

  5. Flush any buffered logs

  6. Exit with code 0

If it takes too long (say, more than 30 seconds), something is stuck and the process should force-exit with code 1.
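The ordering and the deadline described above can be sketched independently of any framework. This is a minimal illustration, not the implementation used later in this article: `runShutdownSteps` and the step names are hypothetical, and it returns an exit code instead of calling process.exit() so it can be tried safely.

```javascript
// Hypothetical helper: runs cleanup steps in order, giving up after timeoutMs.
// Resolves to 0 if every step finished in time, 1 on timeout or on a step error.
async function runShutdownSteps(steps, timeoutMs) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('shutdown timed out')), timeoutMs);
  });

  // Steps run strictly one after another, in the order given
  const runAll = (async () => {
    for (const [name, step] of steps) {
      await step();
      console.log(`done: ${name}`);
    }
  })();

  try {
    await Promise.race([runAll, deadline]);
    return 0;
  } catch (err) {
    console.error(`shutdown failed: ${err.message}`);
    return 1;
  } finally {
    clearTimeout(timer);   // a finished shutdown must not leave the timer pending
  }
}

// Example wiring (placeholder steps; real ones would close the server, workers, DB):
// const code = await runShutdownSteps([
//   ['stop accepting connections', () => closeServer()],
//   ['close database pool', () => db.end()],
// ], 30_000);
// process.exit(code);
```

Returning the code rather than exiting inside the helper keeps the policy (what counts as failure) separate from the mechanism (actually terminating the process).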


The Full Implementation

// src/server.js
import http from 'http';
import app from './app.js';
import db from './lib/db.js';
import redis from './lib/redis.js';
import logger from './lib/logger.js';
import { emailWorker, reportWorker } from './workers/index.js';

const PORT = process.env.PORT || 3000;
const server = http.createServer(app);

// Track whether we are shutting down
// Used by health check to signal load balancers to stop sending traffic
// Exported so route modules such as src/routes/health.js can read the live value
export let isShuttingDown = false;

// Track active connections so we can drain them
const activeConnections = new Set();

server.on('connection', (socket) => {
  activeConnections.add(socket);
  socket.once('close', () => activeConnections.delete(socket));
});

// ─────────────────────────────────────────────────────
// Graceful shutdown handler
// ─────────────────────────────────────────────────────
async function gracefulShutdown(signal) {
  if (isShuttingDown) return;   // Prevent double-shutdown if multiple signals arrive
  isShuttingDown = true;

  logger.info({ signal }, 'Shutdown signal received, starting graceful shutdown');

  // Force exit if shutdown takes too long
  // 30s is generous — tune down to 10-15s if your requests are typically fast
  const forceExitTimer = setTimeout(() => {
    logger.error('Graceful shutdown timed out, forcing exit');
    process.exit(1);
  }, 30_000);
  forceExitTimer.unref();   // Don't let this timer keep the process alive

  try {
    // Step 1: Stop accepting new connections
    // server.close() refuses new connections right away; its callback fires
    // only after every existing connection has ended
    const serverClosed = new Promise((resolve, reject) => {
      server.close((err) => {
        if (err) reject(err);
        else resolve();
      });
    });
    logger.info('HTTP server no longer accepting connections');

    // Step 2: Wait for in-flight requests to finish
    // Busy sockets finish their current request. Idle keep-alive sockets are
    // closed as soon as they go idle; otherwise they would hold the server
    // open until the force-exit timer fires.
    // closeIdleConnections() requires Node.js 18.2 or later
    server.closeIdleConnections();
    if (activeConnections.size > 0) {
      logger.info({ count: activeConnections.size }, 'Waiting for active connections to drain');
      await new Promise((resolve) => {
        const check = setInterval(() => {
          server.closeIdleConnections();
          if (activeConnections.size === 0) {
            clearInterval(check);
            resolve();
          }
        }, 100);
      });
    }
    await serverClosed;
    logger.info('HTTP server closed, all connections drained');

    // Step 3: Close BullMQ workers (finish current jobs, stop picking up new ones)
    logger.info('Closing background workers...');
    await Promise.all([
      emailWorker.close(),
      reportWorker.close(),
    ]);
    logger.info('Workers closed');

    // Step 4: Close database pool
    await db.end();
    logger.info('Database pool closed');

    // Step 5: Close Redis connection
    await redis.quit();
    logger.info('Redis connection closed');

    clearTimeout(forceExitTimer);
    logger.info('Graceful shutdown complete');
    // Exit 0 only when the shutdown was requested by a signal; a shutdown
    // triggered by an unhandled error should still report failure
    const requestedStop = signal === 'SIGTERM' || signal === 'SIGINT';
    process.exit(requestedStop ? 0 : 1);

  } catch (err) {
    logger.error({ error: err.message }, 'Error during graceful shutdown');
    process.exit(1);
  }
}

// Listen for termination signals
process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));  // Docker, Kubernetes
process.on('SIGINT',  () => gracefulShutdown('SIGINT'));   // Ctrl+C in dev

// ─────────────────────────────────────────────────────
// Unhandled promise rejections and exceptions
// ─────────────────────────────────────────────────────
process.on('unhandledRejection', (reason, promise) => {
  logger.error({ reason, promise }, 'Unhandled promise rejection');
  // In production, treat this as fatal — exit and let the process manager restart
  gracefulShutdown('unhandledRejection');
});

process.on('uncaughtException', (err) => {
  logger.error({ error: err.message, stack: err.stack }, 'Uncaught exception');
  gracefulShutdown('uncaughtException');
});

// ─────────────────────────────────────────────────────
// Start server
// ─────────────────────────────────────────────────────
server.listen(PORT, () => {
  logger.info({ port: PORT }, 'Server started');
});

The Health Check Endpoint

A health check that returns 200 during shutdown is actively harmful — your load balancer keeps sending traffic to a process that is trying to shut down. Your health check must respect the shutdown state.

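// src/routes/health.js
import { Router } from 'express';
import db from '../lib/db.js';
import redis from '../lib/redis.js';
// Assumed paths: point these at wherever your project defines them
import { emailQueue } from '../queues/index.js';
import { authenticate } from '../middleware/auth.js';
// The shutdown flag lives in src/server.js and must be declared there as
// `export let isShuttingDown` (or moved to a shared module) for this import to resolve
import { isShuttingDown } from '../server.js';

const router = Router();
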
// Lightweight liveness check — just "is the process running?"
// Used by Docker/Kubernetes to know if the container should be restarted
router.get('/health/live', (req, res) => {
  if (isShuttingDown) {
    return res.status(503).json({ status: 'shutting_down' });
  }
  res.json({ status: 'ok' });
});

// Readiness check — "is the app ready to serve traffic?"
// Load balancers use this. Return 503 to drain traffic before shutdown.
router.get('/health/ready', async (req, res) => {
  if (isShuttingDown) {
    return res.status(503).json({
      status: 'shutting_down',
      message: 'Draining traffic',
    });
  }

  // Check actual dependencies — don't lie to the load balancer
  const checks = await Promise.allSettled([
    db.query('SELECT 1'),           // Database reachable?
    redis.ping(),                   // Redis reachable?
  ]);

  const dbOk = checks[0].status === 'fulfilled';
  const redisOk = checks[1].status === 'fulfilled';
  const healthy = dbOk && redisOk;

  res.status(healthy ? 200 : 503).json({
    status: healthy ? 'ok' : 'degraded',
    checks: {
      database: dbOk ? 'ok' : 'error',
      redis: redisOk ? 'ok' : 'error',
    },
  });
});

// Deep health check: more expensive, used for alerting, not load balancing
router.get('/health/detail', authenticate, async (req, res) => {
  // Use a distinct name for the result; destructuring into `emailQueue` would
  // shadow the queue object and throw a ReferenceError
  const emailCounts = await emailQueue.getJobCounts();

  res.json({
    status: 'ok',
    uptime: process.uptime(),
    memory: process.memoryUsage(),
    queues: { email: emailCounts },
    version: process.env.APP_VERSION || 'unknown',
  });
});

export default router;
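The routes above read isShuttingDown, db, and redis from module scope, which ties them to wherever those values happen to live. One alternative is to inject them. The sketch below is an assumption-laden illustration, not part of the original code: `createReadinessHandler` and its option names are hypothetical, and it only relies on the Express-style `res.status(...).json(...)` calls already used in this article.

```javascript
// Hypothetical factory: builds a readiness handler from injected dependencies.
// `isShuttingDown` is a function returning the current flag;
// `checks` maps a dependency name to a function that rejects when unhealthy.
function createReadinessHandler({ isShuttingDown, checks }) {
  return async function ready(req, res) {
    if (isShuttingDown()) {
      return res.status(503).json({ status: 'shutting_down', message: 'Draining traffic' });
    }

    const names = Object.keys(checks);
    // The async wrapper turns a synchronous throw into a rejection as well
    const results = await Promise.allSettled(names.map(async (name) => checks[name]()));

    const report = {};
    results.forEach((result, i) => {
      report[names[i]] = result.status === 'fulfilled' ? 'ok' : 'error';
    });
    const healthy = results.every((result) => result.status === 'fulfilled');

    return res.status(healthy ? 200 : 503).json({
      status: healthy ? 'ok' : 'degraded',
      checks: report,
    });
  };
}

// Example wiring, using the names from this article:
// router.get('/health/ready', createReadinessHandler({
//   isShuttingDown: () => isShuttingDown,
//   checks: {
//     database: () => db.query('SELECT 1'),
//     redis: () => redis.ping(),
//   },
// }));
```

Passing the flag as a function means the handler always reads the current value, and the handler can be exercised in a unit test with stub checks and a fake response object.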

Docker and Kubernetes Config

For Docker Compose:

services:
  app:
    image: your-app
    stop_grace_period: 30s    # Give the app time to shut down before force kill
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:3000/health/live"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 15s

For Kubernetes:

spec:
  containers:
  - name: app
    livenessProbe:
      httpGet:
        path: /health/live
        port: 3000
      initialDelaySeconds: 10
      periodSeconds: 10

    readinessProbe:
      httpGet:
        path: /health/ready
        port: 3000
      initialDelaySeconds: 5
      periodSeconds: 5

  terminationGracePeriodSeconds: 30   # Must be >= your shutdown timeout

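The readiness probe is the critical one for zero-downtime deployments. Kubernetes takes a pod out of rotation once /health/ready has failed failureThreshold consecutive probes (3 by default), so with periodSeconds: 5 this can take around 15 seconds rather than happening instantly. When a pod is terminated, Kubernetes also removes it from the Service endpoints on its own, in parallel with sending SIGTERM, and that update reaches proxies and load balancers asynchronously. Either way, new requests can still arrive for a short time after the signal. The handler above sets isShuttingDown = true and then closes the server right away, so those late requests are refused. To let traffic drain first, wait a few seconds between setting the flag and calling server.close(), or add a preStop sleep to the pod spec.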
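For the readiness probe to actually observe a 503, the server has to stay reachable for a short while after the flag flips. A minimal sketch of that delay follows; `state`, `sleep`, and `drainThenClose` are illustrative names rather than part of the code above, and the right delay depends on your probe settings.

```javascript
// Illustrative shutdown state; in the article's code this is the isShuttingDown flag
const state = { isShuttingDown: false };

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Flip the flag first, keep serving for drainDelayMs, then stop accepting connections.
// `server` is anything with a Node-style close(callback), such as an http.Server.
async function drainThenClose(server, drainDelayMs) {
  state.isShuttingDown = true;    // readiness checks start returning 503 from here on
  await sleep(drainDelayMs);      // late requests are still accepted and served
  await new Promise((resolve, reject) => {
    server.close((err) => (err ? reject(err) : resolve()));
  });
}

// Example: wait roughly failureThreshold * periodSeconds before closing
// await drainThenClose(server, 15_000);
```

The delay counts against the kill deadline: drain delay plus the rest of the shutdown work has to fit inside terminationGracePeriodSeconds (or stop_grace_period), so raise that value if you add a long delay.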


Common Mistakes

Not calling server.close(): A common pattern is to handle SIGTERM and call process.exit() directly. This drops all in-flight requests. Always close the server first.

Setting the force-exit timer too high: A 5-minute timeout means a stuck process holds a deployment slot for 5 minutes. Keep it at 15 to 30 seconds, and no longer than the platform's kill deadline (stop_grace_period in Docker Compose, terminationGracePeriodSeconds in Kubernetes), or the platform sends SIGKILL before your timer fires.

Health check ignoring shutdown state: If /health/ready returns 200 during shutdown, load balancers keep sending traffic and requests keep arriving while the process is trying to stop. Always check isShuttingDown in your readiness endpoint.

Not closing the DB pool: Open database connections keep the event loop alive, so a shutdown that relies on the process exiting on its own hangs until the force-exit timer fires, and you get exit(1) instead of exit(0). Even with an explicit process.exit(0), skipping db.end() cuts connections off abruptly instead of closing them cleanly on the database side.

Not waiting for BullMQ workers: A worker killed mid-job leaves the job in an indeterminate state. BullMQ will re-queue stalled jobs, but it is cleaner to call worker.close() and let the current job finish.


Testing Your Shutdown

# Terminal 1: start your app
node src/server.js
# Once it exits, print the app's exit code (should be 0)
echo "Exit code: $?"

# Terminal 2: send a request that takes a while
curl -X POST http://localhost:3000/api/slow-endpoint &

# While that's running, send SIGTERM to the listening process only
# (-sTCP:LISTEN keeps lsof from also matching the curl client)
kill -SIGTERM $(lsof -t -i:3000 -sTCP:LISTEN)

# The slow request should complete before the process exits

If your slow request completes and you see Graceful shutdown complete in the logs before the process exits, your implementation is correct.
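The same check can be automated. The sketch below is one possible shape, under stated assumptions: it spawns a tiny stand-in server rather than your real app, `testGracefulShutdown` and `childSource` are hypothetical names, and it relies on POSIX signal delivery (SIGTERM behaves differently on Windows).

```javascript
import { spawn } from 'node:child_process';
import { once } from 'node:events';
import http from 'node:http';
import readline from 'node:readline';

// Stand-in app for the test: one slow endpoint plus a SIGTERM handler.
// It prints its port on startup and "request" when a request arrives.
const childSource = `
  const http = require('node:http');
  const server = http.createServer((req, res) => {
    console.log('request');
    setTimeout(() => res.end('done'), 300);   // simulated slow endpoint
  });
  process.on('SIGTERM', () => server.close(() => process.exit(0)));
  server.listen(0, '127.0.0.1', () => console.log(server.address().port));
`;

async function testGracefulShutdown() {
  const child = spawn(process.execPath, ['-e', childSource], {
    stdio: ['ignore', 'pipe', 'inherit'],
  });
  const lines = readline.createInterface({ input: child.stdout });
  const exited = once(child, 'exit');          // resolves with [code, signal]

  const [portLine] = await once(lines, 'line');
  const port = Number(portLine);

  // Start listening for the "request" line before sending the request
  const requestSeen = once(lines, 'line');
  const response = new Promise((resolve, reject) => {
    // agent: false sends Connection: close, so no idle socket outlives the response
    http.get({ host: '127.0.0.1', port, agent: false }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });

  await requestSeen;                           // the request is now in flight
  child.kill('SIGTERM');

  const body = await response;
  const [code] = await exited;
  return { body, code };
}

// Example: const { body, code } = await testGracefulShutdown();
// A graceful shutdown yields the full response body and exit code 0.
```

To exercise a real app instead of the stand-in, spawn `node src/server.js` with a known PORT and point the request at a genuinely slow route.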
