Rate Limiting Express.js APIs
A practical guide to rate limiting Express.js APIs covering algorithms, express-rate-limit, Redis-backed storage, per-route limits, tiered plans, proper headers, and proxy configuration.
Rate limiting controls how many requests a client can make to your API within a given time window. Without it, a single client — malicious or just poorly written — can exhaust your server resources, spike your cloud bill, and ruin the experience for everyone else. If you are running a public-facing Express.js API, rate limiting is not optional; it is infrastructure.
Prerequisites
- Node.js v16+ installed
- Working knowledge of Express.js middleware
- Basic understanding of HTTP status codes (particularly 429)
- Redis installed locally or accessible remotely (for the Redis-backed sections)
- Familiarity with npm package management
Why Rate Limiting Matters
Rate limiting solves three distinct problems, and you need to think about each one separately because the mitigation strategy differs.
Brute Force Prevention
Authentication endpoints are the most obvious target. Without rate limiting on /login or /auth/token, an attacker can try thousands of credential combinations per minute. A five-requests-per-minute limit on login attempts stretches a brute force attack from a matter of hours to a matter of years.
// Without rate limiting, this endpoint is wide open
app.post("/auth/login", function(req, res) {
var user = authenticate(req.body.email, req.body.password);
if (!user) {
return res.status(401).json({ error: "Invalid credentials" });
}
res.json({ token: generateToken(user) });
});
An attacker running a simple loop against this endpoint can attempt hundreds of passwords per second. Rate limiting fixes this immediately.
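As a preview of the fix (express-rate-limit is covered in detail below), here is a minimal sketch that caps attempts per IP — the numbers are illustrative:
var rateLimit = require("express-rate-limit");
// Cap login attempts at 5 per minute per IP (illustrative limits)
var loginLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 5
});
app.post("/auth/login", loginLimiter, function(req, res) {
  // ... same authentication logic as above
});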
Cost Control
If your API calls downstream services — third-party APIs, database queries, ML inference endpoints — every request costs you money. A runaway script or a misconfigured client can generate tens of thousands of requests in minutes. Rate limiting puts a ceiling on your exposure.
I have seen a single misconfigured cron job generate $2,000 in Elasticsearch query costs over a weekend because there was no rate limit on the search endpoint. That was a painful Monday morning.
Fair Usage
Multi-tenant APIs need rate limiting to prevent one noisy tenant from consuming all available capacity. This is especially important when you have free-tier users sharing infrastructure with paying customers.
Rate Limiting Algorithms Explained
There are four major algorithms used in rate limiting. Each has trade-offs. Understanding them helps you pick the right tool.
Fixed Window
The simplest approach. Divide time into fixed intervals (say, 1-minute windows). Count requests per client in each window. Reset the counter when the window expires.
Window: 12:00:00 - 12:00:59 → 100 requests allowed
Window: 12:01:00 - 12:01:59 → counter resets to 0
Problem: A client can send 100 requests at 12:00:59 and another 100 at 12:01:00, effectively making 200 requests in 2 seconds. This is the "boundary burst" problem.
When to use it: Simple internal APIs where occasional bursts are acceptable.
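For concreteness, here is a minimal in-memory fixed-window counter — a sketch, not production code (the Map-based storage and function names are illustrative):
// Fixed-window counter (illustrative sketch)
var windows = new Map(); // key -> { windowStart, count }
function fixedWindowAllow(key, limit, windowMs) {
  var now = Date.now();
  var record = windows.get(key);
  // Start a fresh window if none exists or the previous one expired
  if (!record || now - record.windowStart >= windowMs) {
    windows.set(key, { windowStart: now, count: 1 });
    return true;
  }
  if (record.count >= limit) {
    return false; // over the limit for this window
  }
  record.count++;
  return true;
}
// Usage: 100 requests per 60-second window, keyed by IP
// fixedWindowAllow(req.ip, 100, 60 * 1000);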
Sliding Window
Combines the current window count with a weighted portion of the previous window count. This smooths out the boundary burst problem.
Previous window (12:00:00 - 12:00:59): 80 requests
Current window (12:01:00 - 12:01:59): 30 requests so far
Current position: 12:01:15 (25% into current window)
Weighted count = (80 * 0.75) + 30 = 90
Limit: 100 → 10 requests remaining
When to use it: Most general-purpose API rate limiting where you want smoother enforcement than a fixed window. (Note that express-rate-limit's default MemoryStore uses a fixed window; true sliding-window behavior requires a store or library that implements it.)
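A minimal sketch of the weighted-count calculation (the window alignment and Map-based storage are illustrative):
// Sliding-window counter (illustrative sketch): weight the previous
// fixed window by how far we are into the current one
var counters = new Map(); // key -> { windowId, current, previous }
function slidingWindowAllow(key, limit, windowMs) {
  var now = Date.now();
  var windowId = Math.floor(now / windowMs);
  var b = counters.get(key) || { windowId: windowId, current: 0, previous: 0 };
  if (windowId > b.windowId) {
    // Roll forward: last window's count becomes "previous";
    // anything older than one full window contributes nothing
    b.previous = (windowId === b.windowId + 1) ? b.current : 0;
    b.current = 0;
    b.windowId = windowId;
  }
  var position = (now % windowMs) / windowMs; // fraction into current window
  var weighted = b.previous * (1 - position) + b.current;
  if (weighted >= limit) {
    counters.set(key, b);
    return false;
  }
  b.current++;
  counters.set(key, b);
  return true;
}
// With the numbers above: previous = 80, position = 0.25,
// weighted = 80 * 0.75 + 30 = 90, so 10 requests remain under a limit of 100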
Token Bucket
Imagine a bucket that holds tokens. Each request removes a token. Tokens are added back at a fixed rate. If the bucket is empty, the request is rejected. The bucket has a maximum capacity, so tokens do not accumulate indefinitely.
Bucket capacity: 10 tokens
Refill rate: 1 token per second
Client sends 10 requests instantly → bucket is empty
Client waits 5 seconds → 5 tokens available
Client sends 3 requests → 2 tokens remaining
When to use it: APIs where you want to allow short bursts but enforce a sustained average rate. Payment APIs and webhook delivery systems benefit from this pattern.
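A minimal token-bucket sketch with lazy refill — the capacity and refill rate mirror the example above and are illustrative:
// Token bucket (illustrative sketch): refill lazily when a request arrives
var tokenBuckets = new Map(); // key -> { tokens, lastRefill }
function tokenBucketAllow(key, capacity, refillPerSecond) {
  var now = Date.now();
  var b = tokenBuckets.get(key) || { tokens: capacity, lastRefill: now };
  // Credit tokens for the elapsed time, capped at bucket capacity
  var elapsedSeconds = (now - b.lastRefill) / 1000;
  b.tokens = Math.min(capacity, b.tokens + elapsedSeconds * refillPerSecond);
  b.lastRefill = now;
  if (b.tokens < 1) {
    tokenBuckets.set(key, b);
    return false; // bucket empty: reject
  }
  b.tokens -= 1; // each request spends one token
  tokenBuckets.set(key, b);
  return true;
}
// Usage: capacity 10, refilling 1 token per second, as in the example above
// tokenBucketAllow(req.ip, 10, 1);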
Leaky Bucket
Requests enter a queue (the bucket). The queue is processed at a fixed rate. If the queue is full, new requests are rejected. Unlike token bucket, this enforces a strict output rate — requests are processed at a steady drip regardless of how they arrive.
When to use it: When you need a perfectly smooth request rate, such as upstream APIs with strict rate limits of their own.
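A minimal leaky-bucket sketch using a queue drained on a timer — the capacity, drain rate, and processFn callback are illustrative:
// Leaky bucket (illustrative sketch): queue requests, drain at a fixed rate
function createLeakyBucket(capacity, drainPerSecond, processFn) {
  var queue = [];
  // Drain one queued job per tick — a steady drip regardless of arrival bursts
  setInterval(function() {
    var job = queue.shift();
    if (job) processFn(job);
  }, 1000 / drainPerSecond);
  return {
    offer: function(job) {
      if (queue.length >= capacity) {
        return false; // bucket full: reject
      }
      queue.push(job);
      return true;
    }
  };
}
// Usage: queue up to 50 jobs, process 5 per second
// var bucket = createLeakyBucket(50, 5, handleUpstreamCall);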
Basic Rate Limiting with express-rate-limit
The express-rate-limit package is the de facto standard for Express.js rate limiting. It is well-maintained, has sensible defaults, and covers 90% of use cases.
npm install express-rate-limit
var express = require("express");
var rateLimit = require("express-rate-limit");
var app = express();
var globalLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // limit each IP to 100 requests per window
standardHeaders: true, // return RateLimit-* headers
legacyHeaders: false, // disable X-RateLimit-* headers
message: {
status: 429,
error: "Too many requests. Please try again later."
}
});
app.use(globalLimiter);
app.get("/api/data", function(req, res) {
res.json({ result: "Here is your data" });
});
app.listen(3000, function() {
console.log("Server running on port 3000");
});
Configuring Windows, Max Requests, and Response Messages
The three most important configuration options are windowMs, max, and message. Get these wrong and you either block legitimate users or leave the door wide open.
var limiter = rateLimit({
windowMs: 60 * 1000, // 1 minute
max: 30, // 30 requests per minute
message: {
status: 429,
error: "Rate limit exceeded",
retryAfter: 60 // tell the client when to retry
},
statusCode: 429, // HTTP status code (429 is the default)
skipSuccessfulRequests: false, // count all requests, not just failures
skipFailedRequests: false, // count failed requests too
keyGenerator: function(req) {
return req.ip; // identify clients by IP (default)
}
});
A word on choosing your window size: shorter windows (1 minute) provide tighter control but can feel aggressive to legitimate users. Longer windows (15 minutes) are more forgiving but allow larger sustained bursts. For most public APIs, 15-minute windows with generous limits work well. For authentication endpoints, use 1-minute or even 15-second windows with very low limits.
Per-Route Rate Limiting
Not all endpoints deserve the same limits. Your login endpoint should be much more restrictive than your public data endpoint. Apply different limiters to different routes.
var express = require("express");
var rateLimit = require("express-rate-limit");
var app = express();
// Strict limit for authentication
var authLimiter = rateLimit({
windowMs: 15 * 60 * 1000, // 15 minutes
max: 5, // 5 attempts per 15 minutes
message: {
status: 429,
error: "Too many login attempts. Please try again in 15 minutes."
},
standardHeaders: true,
legacyHeaders: false
});
// Moderate limit for API endpoints
var apiLimiter = rateLimit({
windowMs: 60 * 1000, // 1 minute
max: 60, // 60 requests per minute
message: {
status: 429,
error: "API rate limit exceeded."
},
standardHeaders: true,
legacyHeaders: false
});
// Generous limit for public/static content
var publicLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 300,
standardHeaders: true,
legacyHeaders: false
});
// Apply different limits to different route groups
app.use("/auth", authLimiter);
app.use("/api", apiLimiter);
app.use("/public", publicLimiter);
app.post("/auth/login", function(req, res) {
// login logic
res.json({ token: "abc123" });
});
app.get("/api/users", function(req, res) {
res.json({ users: [] });
});
app.get("/public/docs", function(req, res) {
res.json({ docs: "API documentation here" });
});
app.listen(3000);
This is the pattern I use on every API I build. Authentication gets the strictest limits, API endpoints get moderate limits, and public content gets generous limits. Adjust the numbers based on your actual traffic patterns.
Identifying Clients
The default keyGenerator uses req.ip, but this is not always sufficient.
IP-Based Identification
var limiter = rateLimit({
windowMs: 60 * 1000,
max: 30,
keyGenerator: function(req) {
return req.ip;
}
});
IP-based limiting is simple but flawed. Multiple users behind a corporate NAT share one IP. A single user with a VPN can rotate IPs. It is a reasonable default but not a complete solution.
API Key-Based Identification
var limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
keyGenerator: function(req) {
var apiKey = req.headers["x-api-key"] || req.query.api_key;
if (!apiKey) {
return req.ip; // fall back to IP for unauthenticated requests
}
return apiKey;
}
});
This is better for multi-tenant APIs. Each API key gets its own rate limit bucket, so one customer's heavy usage does not affect another.
User-Based Identification
var limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
keyGenerator: function(req) {
if (req.user && req.user.id) {
return "user:" + req.user.id;
}
return "ip:" + req.ip;
}
});
This requires the rate limiter to run after your authentication middleware, so the req.user object is populated.
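A minimal sketch of the ordering (authMiddleware is a placeholder for your own authentication):
// Order matters: authentication must populate req.user before the limiter runs
app.use("/api", authMiddleware); // sets req.user
app.use("/api", limiter);        // keyGenerator can now read req.user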
Rate Limiting Behind Reverse Proxies
If your Express.js app runs behind Nginx, an AWS ALB, a Cloudflare proxy, or any reverse proxy, req.ip will return the proxy's IP address, not the client's. Every single client will share one rate limit bucket. This is the number one rate limiting misconfiguration I see in production.
Configuring trust proxy
var app = express();
// Trust first proxy (Nginx, ALB, etc.)
app.set("trust proxy", 1);
// For multiple proxies (e.g., Cloudflare → ALB → Express)
// app.set("trust proxy", 2);
// Trust specific proxy subnets
// app.set("trust proxy", "loopback, 10.0.0.0/8");
With trust proxy set correctly, Express reads the X-Forwarded-For header and populates req.ip with the actual client IP.
Warning: Never set trust proxy to true in production. This trusts every proxy in the chain, which means a malicious client can spoof their IP by setting X-Forwarded-For themselves. Always specify the exact number of proxies or the trusted subnets.
# A spoofed request when trust proxy is too permissive
curl -H "X-Forwarded-For: 1.2.3.4" https://your-api.com/endpoint
# Your app thinks the client IP is 1.2.3.4
DigitalOcean App Platform
On DigitalOcean App Platform, the platform itself acts as a reverse proxy. Set trust proxy to 1:
app.set("trust proxy", 1);
AWS (ALB + CloudFront)
Behind CloudFront and an Application Load Balancer, you have two proxies:
app.set("trust proxy", 2);
Storing Rate Limit State
In-Memory (Default)
The default express-rate-limit store keeps counters in process memory. This works for single-instance deployments but breaks immediately when you scale.
// This is the default — no configuration needed
var limiter = rateLimit({
windowMs: 60 * 1000,
max: 100
// store defaults to MemoryStore
});
Problem: If you run three instances of your app, each instance has its own counter. A client can make 100 requests to instance A, 100 to instance B, and 100 to instance C — 300 total requests within your "100 per minute" limit.
Redis-Backed Storage
Redis solves the multi-instance problem. All instances share one Redis store for rate limit counters.
npm install rate-limit-redis ioredis
var express = require("express");
var rateLimit = require("express-rate-limit");
var RedisStore = require("rate-limit-redis").default;
var Redis = require("ioredis");
var app = express();
var redisClient = new Redis({
host: process.env.REDIS_HOST || "127.0.0.1",
port: process.env.REDIS_PORT || 6379,
password: process.env.REDIS_PASSWORD || undefined,
enableOfflineQueue: false,
maxRetriesPerRequest: 1,
retryStrategy: function(times) {
if (times > 3) {
return null; // stop retrying after 3 attempts
}
return Math.min(times * 200, 2000);
}
});
redisClient.on("error", function(err) {
console.error("Redis connection error:", err.message);
});
redisClient.on("connect", function() {
console.log("Connected to Redis for rate limiting");
});
var apiLimiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
standardHeaders: true,
legacyHeaders: false,
store: new RedisStore({
sendCommand: function() {
var args = Array.prototype.slice.call(arguments);
return redisClient.call.apply(redisClient, args);
},
prefix: "rl:" // prefix for Redis keys
}),
message: {
status: 429,
error: "Rate limit exceeded. Please try again later."
}
});
app.use("/api", apiLimiter);
The prefix option is important when you use Redis for other purposes. It namespaces the rate limit keys so they do not collide with your cache or session keys.
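With a prefix in place you can inspect the limiter's keys in isolation — for example (the exact key names after the prefix depend on your keyGenerator):
# List only rate limit keys, leaving cache and session keys out of it
redis-cli --scan --match "rl:*"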
Graceful Degradation When Redis Is Down
You do not want your API to go down just because Redis is unreachable. Implement a fallback:
var memoryStore = new rateLimit.MemoryStore();
var redisStore;
try {
redisStore = new RedisStore({
sendCommand: function() {
var args = Array.prototype.slice.call(arguments);
return redisClient.call.apply(redisClient, args);
},
prefix: "rl:"
});
} catch (err) {
console.error("Failed to create Redis store, falling back to memory:", err.message);
redisStore = null;
}
var limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
store: redisStore || memoryStore,
standardHeaders: true,
legacyHeaders: false
});
One caveat: the RedisStore constructor itself rarely throws — connection failures usually surface at request time. Pair this try/catch with the ioredis error listeners shown above, as the complete example later in this guide does.
Rate Limit Headers
Modern APIs communicate rate limit status through response headers. The 429 status code is defined in RFC 6585, and the RateLimit-* header fields are specified in the IETF's RateLimit header fields draft.
RateLimit-Limit: 100
RateLimit-Remaining: 67
RateLimit-Reset: 1706745600
Retry-After: 45
When standardHeaders: true is set in express-rate-limit, these headers are included automatically on every response. The Retry-After header is sent only with 429 responses.
var limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
standardHeaders: true, // sends RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset
legacyHeaders: false, // disables X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
handler: function(req, res) {
var retryAfter = Math.ceil((req.rateLimit.resetTime.getTime() - Date.now()) / 1000);
res.set("Retry-After", String(retryAfter));
res.status(429).json({
status: 429,
error: "Too many requests",
retryAfter: retryAfter
});
}
});
A sample response when the limit is exceeded:
HTTP/1.1 429 Too Many Requests
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 1706745660
Retry-After: 45
Content-Type: application/json
{
"status": 429,
"error": "Too many requests",
"retryAfter": 45
}
Graceful Handling on the Client Side
If you are building clients that consume rate-limited APIs, respect the headers:
var http = require("http");
function makeRequest(url, callback) {
http.get(url, function(res) {
var body = "";
res.on("data", function(chunk) { body += chunk; });
res.on("end", function() {
if (res.statusCode === 429) {
var retryAfter = parseInt(res.headers["retry-after"], 10) || 60;
console.log("Rate limited. Retrying in " + retryAfter + " seconds...");
setTimeout(function() {
makeRequest(url, callback);
}, retryAfter * 1000);
return;
}
callback(null, JSON.parse(body));
});
}).on("error", callback); // surface connection errors instead of swallowing them
}
For production clients, implement exponential backoff with jitter:
function retryWithBackoff(fn, attempt, maxAttempts, callback) {
if (attempt >= maxAttempts) {
return callback(new Error("Max retry attempts exceeded"));
}
fn(function(err, result) {
if (err && err.statusCode === 429) {
var retryAfter = err.retryAfter || Math.pow(2, attempt);
var jitter = Math.random() * 1000;
var delay = (retryAfter * 1000) + jitter;
console.log("Attempt " + (attempt + 1) + " rate limited. Waiting " + Math.round(delay / 1000) + "s");
setTimeout(function() {
retryWithBackoff(fn, attempt + 1, maxAttempts, callback);
}, delay);
return;
}
callback(err, result);
});
}
The jitter is critical. Without it, all clients that were rate-limited at the same time will retry simultaneously, creating a thundering herd that triggers the rate limit again.
Tiered Rate Limits (Free vs. Paid Plans)
Real-world APIs have different rate limits for different subscription tiers. Implement this with a dynamic max function:
var tierLimits = {
free: 30,
basic: 100,
pro: 500,
enterprise: 2000
};
var tieredLimiter = rateLimit({
windowMs: 60 * 1000,
max: function(req) {
if (!req.user) {
return tierLimits.free;
}
return tierLimits[req.user.tier] || tierLimits.free;
},
keyGenerator: function(req) {
if (req.user && req.user.id) {
return "user:" + req.user.id;
}
return "ip:" + req.ip;
},
standardHeaders: true,
legacyHeaders: false,
handler: function(req, res) {
var tier = (req.user && req.user.tier) || "free";
var limit = tierLimits[tier];
res.status(429).json({
status: 429,
error: "Rate limit exceeded",
tier: tier,
limit: limit,
windowMs: 60000,
upgrade: tier === "free" ? "Upgrade to Basic for higher limits: https://api.example.com/pricing" : undefined
});
}
});
// Apply after authentication middleware
app.use(authMiddleware);
app.use("/api", tieredLimiter);
This pattern nudges free-tier users toward paid plans when they hit the limit — a legitimate and effective monetization strategy.
Rate Limiting WebSocket Connections
WebSocket connections persist, so HTTP-level rate limiting does not apply after the initial handshake. You need to rate limit messages within the connection.
var WebSocket = require("ws");
var wss = new WebSocket.Server({ port: 8080 });
function createMessageLimiter(maxMessages, windowMs) {
var clients = new Map();
return {
isAllowed: function(clientId) {
var now = Date.now();
var record = clients.get(clientId);
if (!record || now - record.windowStart > windowMs) {
clients.set(clientId, { windowStart: now, count: 1 });
return true;
}
if (record.count >= maxMessages) {
return false;
}
record.count++;
return true;
},
cleanup: function() {
var now = Date.now();
clients.forEach(function(record, clientId) {
if (now - record.windowStart > windowMs) {
clients.delete(clientId);
}
});
}
};
}
var messageLimiter = createMessageLimiter(60, 60 * 1000); // 60 messages per minute
// Clean up stale entries every 5 minutes
setInterval(function() { messageLimiter.cleanup(); }, 5 * 60 * 1000);
wss.on("connection", function(ws, req) {
var clientId = req.socket.remoteAddress;
ws.on("message", function(message) {
if (!messageLimiter.isAllowed(clientId)) {
ws.send(JSON.stringify({
error: "Rate limit exceeded",
retryAfter: 60
}));
return;
}
// Process message normally
ws.send(JSON.stringify({ received: true }));
});
});
You should also rate limit connection attempts themselves. A bot that opens and closes hundreds of WebSocket connections per second can exhaust your file descriptors even without sending messages.
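Reusing the factory above for connection attempts is one way to do it — a sketch (the 10-per-minute limit is illustrative, and 1013 is the standard "Try Again Later" close code; for stricter control, reject during the HTTP upgrade instead):
// Limit new connections per IP, reusing createMessageLimiter from above
var connectionLimiter = createMessageLimiter(10, 60 * 1000);
wss.on("connection", function(ws, req) {
  var clientId = req.socket.remoteAddress;
  if (!connectionLimiter.isAllowed(clientId)) {
    ws.close(1013, "Connection rate limit exceeded"); // 1013 = Try Again Later
    return;
  }
  // ... attach message handlers as shown above
});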
DDoS vs. Application-Level Rate Limiting
This is important and often misunderstood: application-level rate limiting does not protect you from DDoS attacks.
Express rate limiting operates at Layer 7 (application layer). By the time a request reaches your Express middleware, it has already consumed network bandwidth, passed through your load balancer, and created a TCP connection. A volumetric DDoS attack overwhelms those lower layers before your rate limiter ever sees the traffic.
What application-level rate limiting does solve:
- Abusive API clients making too many legitimate-looking requests
- Brute force password attacks
- Scraping prevention
- Cost control for downstream service calls
- Fair usage enforcement across tenants
What it does not solve:
- Volumetric DDoS (SYN floods, UDP amplification)
- Protocol-level attacks (Slowloris, HTTP/2 rapid reset)
- Distributed attacks from thousands of IPs each making a small number of requests
For DDoS protection, you need infrastructure-level solutions: Cloudflare, AWS Shield, DigitalOcean Cloud Firewall, or dedicated DDoS mitigation services. Layer these with your application-level rate limiting for defense in depth.
Testing Rate Limits
Testing rate limits is straightforward but often skipped. Do not skip it — a misconfigured rate limiter either blocks legitimate users or fails to block abusive ones.
Unit Testing with Supertest
npm install --save-dev supertest mocha
var request = require("supertest");
var express = require("express");
var rateLimit = require("express-rate-limit");
describe("Rate Limiting", function() {
var app;
beforeEach(function() {
app = express();
var limiter = rateLimit({
windowMs: 60 * 1000,
max: 3,
standardHeaders: true,
legacyHeaders: false,
message: { error: "Rate limited" }
});
app.use(limiter);
app.get("/test", function(req, res) {
res.json({ ok: true });
});
});
it("should allow requests under the limit", function(done) {
request(app)
.get("/test")
.expect(200)
.expect(function(res) {
if (res.headers["ratelimit-remaining"] !== "2") {
throw new Error("Expected 2 remaining, got " + res.headers["ratelimit-remaining"]);
}
})
.end(done);
});
it("should return 429 when limit is exceeded", function(done) {
var agent = request(app);
agent.get("/test").end(function() {
agent.get("/test").end(function() {
agent.get("/test").end(function() {
agent.get("/test")
.expect(429)
.expect(function(res) {
if (!res.body.error) {
throw new Error("Expected error message in response body");
}
})
.end(done);
});
});
});
});
it("should include rate limit headers", function(done) {
request(app)
.get("/test")
.expect(200)
.expect("ratelimit-limit", "3")
.expect(function(res) {
if (!res.headers["ratelimit-reset"]) {
throw new Error("Missing ratelimit-reset header");
}
})
.end(done);
});
});
Load Testing with curl
# Send 20 rapid requests and observe when rate limiting kicks in
for i in $(seq 1 20); do
echo "Request $i: $(curl -s -o /dev/null -w '%{http_code}' http://localhost:3000/api/data)"
done
Expected output (assuming a limit of 10 requests per window, for illustration):
Request 1: 200
Request 2: 200
...
Request 10: 200
Request 11: 429
Request 12: 429
...
Verifying Headers
curl -i http://localhost:3000/api/data
HTTP/1.1 200 OK
RateLimit-Limit: 100
RateLimit-Remaining: 99
RateLimit-Reset: 1706745720
Content-Type: application/json; charset=utf-8
{"ok":true}
Complete Working Example
Here is a production-ready Express.js API with tiered rate limiting, Redis storage, per-route limits, proper headers, and graceful error handling.
// app.js
var express = require("express");
var rateLimit = require("express-rate-limit");
var RedisStore = require("rate-limit-redis").default;
var Redis = require("ioredis");
var app = express();
app.use(express.json());
// Trust proxy (DigitalOcean App Platform, Nginx, etc.)
app.set("trust proxy", 1);
// ──────────────────────────────────────────────
// Redis Connection
// ──────────────────────────────────────────────
var redisClient = new Redis({
host: process.env.REDIS_HOST || "127.0.0.1",
port: parseInt(process.env.REDIS_PORT, 10) || 6379,
password: process.env.REDIS_PASSWORD || undefined,
enableOfflineQueue: false,
maxRetriesPerRequest: 1
});
var redisAvailable = false;
redisClient.on("connect", function() {
redisAvailable = true;
console.log("Redis connected for rate limiting");
});
redisClient.on("error", function(err) {
redisAvailable = false;
console.error("Redis error:", err.message);
});
// ──────────────────────────────────────────────
// Rate Limit Store Factory
// ──────────────────────────────────────────────
function createStore(prefix) {
  // Note: the limiters below are created at startup, before the Redis
  // "connect" event can fire, so decide from configuration rather than
  // the runtime redisAvailable flag (which is still useful for /health)
  if (process.env.REDIS_HOST) {
    return new RedisStore({
      sendCommand: function() {
        var args = Array.prototype.slice.call(arguments);
        return redisClient.call.apply(redisClient, args);
      },
      prefix: "rl:" + prefix + ":"
    });
  }
  console.warn("REDIS_HOST not set — using in-memory rate limit store for " + prefix);
  return undefined; // express-rate-limit falls back to its MemoryStore
}
// ──────────────────────────────────────────────
// Tier Configuration
// ──────────────────────────────────────────────
var tierLimits = {
anonymous: { perMinute: 20, perDay: 500 },
free: { perMinute: 30, perDay: 1000 },
basic: { perMinute: 100, perDay: 5000 },
pro: { perMinute: 500, perDay: 50000 },
enterprise:{ perMinute: 2000, perDay: 200000 }
};
// ──────────────────────────────────────────────
// Fake Auth Middleware (replace with your own)
// ──────────────────────────────────────────────
function authMiddleware(req, res, next) {
var apiKey = req.headers["x-api-key"];
if (apiKey) {
// In production, look this up in your database
var users = {
"key-free-123": { id: "u1", tier: "free" },
"key-basic-456": { id: "u2", tier: "basic" },
"key-pro-789": { id: "u3", tier: "pro" },
"key-enterprise-000": { id: "u4", tier: "enterprise" }
};
req.user = users[apiKey] || null;
}
next();
}
// ──────────────────────────────────────────────
// Rate Limit Handler (returns 429 with Retry-After)
// ──────────────────────────────────────────────
function rateLimitHandler(req, res) {
var resetTime = req.rateLimit.resetTime;
var retryAfter = Math.ceil((resetTime.getTime() - Date.now()) / 1000);
if (retryAfter < 1) retryAfter = 1;
var tier = (req.user && req.user.tier) || "anonymous";
var limit = tierLimits[tier];
res.set("Retry-After", String(retryAfter));
res.status(429).json({
status: 429,
error: "Rate limit exceeded",
tier: tier,
limit: limit.perMinute + " requests per minute",
retryAfter: retryAfter,
upgrade: (tier === "anonymous" || tier === "free")
? "Get higher limits at https://api.example.com/pricing"
: undefined
});
}
// ──────────────────────────────────────────────
// Global Rate Limiter (applies to all routes)
// ──────────────────────────────────────────────
var globalLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 200,
  standardHeaders: true,
  legacyHeaders: false,
  store: createStore("global"),
  skip: function(req) {
    return req.path === "/health"; // keep monitoring traffic unthrottled
  },
  message: { status: 429, error: "Global rate limit exceeded" }
});
// ──────────────────────────────────────────────
// Auth Endpoint Rate Limiter (very strict)
// ──────────────────────────────────────────────
var authLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 5,
standardHeaders: true,
legacyHeaders: false,
store: createStore("auth"),
skipSuccessfulRequests: true, // only count failed attempts
handler: function(req, res) {
var retryAfter = Math.ceil((req.rateLimit.resetTime.getTime() - Date.now()) / 1000);
res.set("Retry-After", String(retryAfter));
res.status(429).json({
status: 429,
error: "Too many failed login attempts. Account temporarily locked.",
retryAfter: retryAfter
});
}
});
// ──────────────────────────────────────────────
// Tiered API Rate Limiter (per API key / user)
// ──────────────────────────────────────────────
var tieredApiLimiter = rateLimit({
windowMs: 60 * 1000,
max: function(req) {
var tier = (req.user && req.user.tier) || "anonymous";
return tierLimits[tier].perMinute;
},
keyGenerator: function(req) {
if (req.user && req.user.id) {
return "user:" + req.user.id;
}
return "ip:" + req.ip;
},
standardHeaders: true,
legacyHeaders: false,
store: createStore("api"),
handler: rateLimitHandler
});
// ──────────────────────────────────────────────
// Routes
// ──────────────────────────────────────────────
// Apply global limiter first
app.use(globalLimiter);
// Auth routes with strict limiting
app.post("/auth/login", authLimiter, function(req, res) {
var email = req.body.email;
var password = req.body.password;
// Fake authentication
if (email === "[email protected]" && password === "correct-password") {
return res.json({
token: "jwt-token-here",
expiresIn: 3600
});
}
res.status(401).json({ error: "Invalid credentials" });
});
// API routes with tiered limiting (auth middleware runs first)
app.use("/api", authMiddleware, tieredApiLimiter);
app.get("/api/users", function(req, res) {
res.json({
users: [
{ id: 1, name: "Alice" },
{ id: 2, name: "Bob" }
]
});
});
app.get("/api/products", function(req, res) {
res.json({
products: [
{ id: 1, name: "Widget", price: 9.99 },
{ id: 2, name: "Gadget", price: 24.99 }
]
});
});
// Health check (no rate limit)
app.get("/health", function(req, res) {
res.json({
status: "ok",
redis: redisAvailable ? "connected" : "disconnected",
uptime: process.uptime()
});
});
// ──────────────────────────────────────────────
// Start Server
// ──────────────────────────────────────────────
var PORT = process.env.PORT || 3000;
app.listen(PORT, function() {
console.log("API server running on port " + PORT);
console.log("Rate limit tiers:", JSON.stringify(tierLimits, null, 2));
});
Test it:
# Successful request as anonymous user (20/min limit)
curl -i http://localhost:3000/api/users
# Request as a pro-tier user (500/min limit)
curl -i -H "X-Api-Key: key-pro-789" http://localhost:3000/api/users
# Hammer the auth endpoint to trigger lockout
for i in $(seq 1 10); do
echo "Attempt $i: $(curl -s -o /dev/null -w '%{http_code}' \
-X POST -H 'Content-Type: application/json' \
-d '{"email":"[email protected]","password":"wrong"}' \
http://localhost:3000/auth/login)"
done
Common Issues and Troubleshooting
1. Every Client Gets Rate Limited Immediately
Symptom: Even first-time visitors get 429 responses — every client is sharing the same rate limit counter
Cause: Your app is behind a reverse proxy but trust proxy is not configured. Every client has the same req.ip (the proxy's IP).
Fix:
app.set("trust proxy", 1);
Verify by logging req.ip — it should show different IPs for different clients, not your proxy's internal IP.
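A throwaway debug route makes this easy to check (/debug/ip is a hypothetical route name — remove it before shipping):
// Temporary route: confirm req.ip reflects the real client address
app.get("/debug/ip", function(req, res) {
  res.json({
    ip: req.ip,
    forwardedFor: req.headers["x-forwarded-for"] || null
  });
});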
2. Rate Limits Reset on Server Restart
Symptom: Deploying a new version resets all rate limit counters
Cause: Using the default in-memory MemoryStore
Fix: Switch to Redis-backed storage. The counters persist across restarts and are shared across instances.
3. Redis Connection Errors Crash the App
Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED 127.0.0.1:6379
Cause: Redis is not running or is unreachable, and the app has no fallback.
Fix: Use the graceful degradation pattern shown earlier. Set enableOfflineQueue: false on ioredis so requests do not queue up indefinitely, and fall back to MemoryStore when Redis is down.
4. Rate Limit Headers Not Appearing in Response
Symptom: curl -i shows no RateLimit-* headers
Cause: standardHeaders is not enabled, or legacyHeaders is overriding them
Fix:
var limiter = rateLimit({
windowMs: 60 * 1000,
max: 100,
standardHeaders: true, // MUST be true for RateLimit-* headers
legacyHeaders: false // set false to avoid duplicate X-RateLimit-* headers
});
Also check that CORS middleware is exposing these headers to browser clients:
var cors = require("cors");
app.use(cors({
exposedHeaders: ["RateLimit-Limit", "RateLimit-Remaining", "RateLimit-Reset", "Retry-After"]
}));
5. skipSuccessfulRequests Not Working as Expected
Symptom: Rate limit counter increases even for successful 200 responses
Cause: skipSuccessfulRequests keys off the response status code, not your application logic
skipSuccessfulRequests: true skips incrementing the counter when the response status code is < 400. If your authentication endpoint returns 200 for failed logins (a common anti-pattern), the limiter will skip those too. Make sure your endpoint returns proper 401/403 status codes for failures.
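A sketch of a login endpoint that plays well with this option — authenticate and generateToken are placeholders:
var loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  skipSuccessfulRequests: true // responses with status < 400 are not counted
});
app.post("/auth/login", loginLimiter, function(req, res) {
  var user = authenticate(req.body.email, req.body.password);
  if (!user) {
    // Must be 401, not 200, or the limiter will skip this failure too
    return res.status(401).json({ error: "Invalid credentials" });
  }
  res.json({ token: generateToken(user) });
});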
6. Multiple Rate Limiters Interfering with Each Other
Symptom: Hitting the global limit when you expected only the route-specific limit to apply
When you stack multiple limiters (global + per-route), they each maintain separate counters. A request can be blocked by any of them. Check which limiter is triggering by examining the response body — each limiter should have a distinct error message.
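Giving each limiter a distinct message makes the triggering limiter obvious from the response body — a small sketch:
// Stacked limiters keep separate counters; label them so 429s are attributable
var globalLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 200,
  message: { error: "Global rate limit exceeded" }
});
var authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  message: { error: "Auth rate limit exceeded" }
});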
Best Practices
Always use Redis (or another external store) in production. In-memory stores fail silently when you scale to multiple instances. By the time you notice, abusive clients have already gotten through.
Set different limits for different endpoint types. Authentication endpoints need the strictest limits (5-10 per 15 minutes). CRUD API endpoints need moderate limits (60-100 per minute). Read-only public endpoints can be more generous (300+ per 15 minutes).
Always send Retry-After headers with 429 responses. Well-behaved clients will respect this header and back off. Without it, clients retry immediately and make the problem worse.
Log rate limit events. When a client hits a rate limit, log their IP, API key, the endpoint, and the current count. This data is invaluable for tuning your limits and identifying abuse patterns.
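For example, express-rate-limit's handler hook is a convenient place for this logging — a minimal sketch (swap console.warn for your logger):
var limiter = rateLimit({
  windowMs: 60 * 1000,
  max: 100,
  standardHeaders: true,
  legacyHeaders: false,
  handler: function(req, res, next, options) {
    // Log who hit the limit and where, then send the usual 429
    console.warn("Rate limit exceeded", {
      ip: req.ip,
      apiKey: req.headers["x-api-key"] || null,
      endpoint: req.originalUrl,
      limit: req.rateLimit.limit
    });
    res.status(options.statusCode).json(options.message);
  }
});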
Do not rate limit health check endpoints. Your monitoring system needs to reach /health or /healthz at all times. Accidentally rate limiting your health checks can cause false alerts or prevent auto-scaling.
Test your rate limits with actual traffic patterns. Set up a staging environment with production-like rate limits and run load tests. What seems reasonable in development often needs adjustment when you see real user behavior.
Use API keys as the primary identifier, not IP addresses. IP-based limiting punishes users behind shared NATs and is easily circumvented by rotating IPs. API key-based limiting ties the limit to the actual consumer.
Start with generous limits and tighten gradually. It is much easier to tighten limits than to explain to angry customers why their integration broke because you set the limit too low on day one.
Document your rate limits publicly. Include the limits in your API documentation, explain the headers, and provide code examples for handling 429 responses. Your API consumers will thank you.
Consider implementing a "burst" allowance. Allow short bursts above the sustained rate (token bucket algorithm) for APIs where clients legitimately need to send a batch of requests occasionally.
