
NGINX API Gateway: Production Configs for Load Balancing, Rate Limiting, and SSL (2026)

The ingress-nginx controller retires March 2026. Your standalone NGINX gateway configs still work. Here's the complete setup with rate limiting, SSL, and Docker Compose.

The Kubernetes Ingress NGINX Controller retires on March 31, 2026. No more releases, no bug fixes, no security patches. If you're running NGINX in Kubernetes, you need a migration plan.

If you're running NGINX as a standalone API gateway on VMs, containers, or bare metal: nothing changes. Your configs still work. NGINX still handles tens of thousands of concurrent connections with minimal memory. It's still the best lightweight option for routing traffic to your microservices.

This guide covers both sides of that equation: production-ready standalone configs (routing, load balancing, rate limiting, SSL, caching, CORS, versioning, Docker Compose) and the Kubernetes migration path from the retired ingress-nginx controller to the Gateway API. I've been using NGINX as an API gateway since the early days of microservices adoption, and I've watched the landscape shift from simple reverse proxy setups to full service meshes. Here's what actually matters in 2026.

Why NGINX Still Wins as an API Gateway

An API gateway sits between your clients and your backend services. Instead of your frontend making requests to five different microservice URLs, it talks to one gateway that routes everything internally.

NGINX remains a top choice for this role for four reasons:

  • Performance: NGINX handles tens of thousands of concurrent connections with minimal memory overhead. It's event-driven, non-blocking, and battle-tested at massive scale.
  • Simplicity: Your gateway configuration is a text file. No complex frameworks, no SDK dependencies, no vendor lock-in.
  • Flexibility: Reverse proxy, load balancer, SSL termination, rate limiting, caching: NGINX does all of it in a single process.
  • Maturity: NGINX has been in production at companies like Netflix, Cloudflare, and Dropbox for over a decade. The failure modes are well understood.

Dedicated API gateway products like Kong, Tyk, and AWS API Gateway offer more features out of the box (developer portals, analytics dashboards, OAuth integration), but they also add complexity, cost, and another dependency. For many teams, NGINX gives you exactly what you need without the overhead.

The Ingress NGINX Retirement: What It Does and Doesn't Affect

In November 2025, Kubernetes SIG Network and the Security Response Committee announced that the Ingress NGINX Controller would be retired on March 31, 2026. This has long been the most widely deployed ingress controller in the Kubernetes ecosystem.

Here's what matters:

What's being retired:

  • The kubernetes/ingress-nginx project (the community-maintained Kubernetes Ingress controller)
  • InGate (the attempted replacement that never gained traction)

What is NOT being retired:

  • NGINX itself (the web server / reverse proxy)
  • NGINX Plus (F5's commercial offering)
  • NGINX Gateway Fabric (the new Kubernetes Gateway API implementation)
  • The Kubernetes Ingress API (you can still create Ingress resources)
  • Standalone NGINX used as an API gateway outside of Kubernetes

If you're running NGINX as an API gateway on a VM, a DigitalOcean droplet, or in a plain Docker container: nothing changes. Your configurations work exactly the same way.

If you're running the Ingress NGINX Controller in Kubernetes, you need a migration plan. That's covered later in this article.

Basic Installation and Your First Gateway Config

Installation

On Ubuntu/Debian:

sudo apt update
sudo apt install nginx
sudo systemctl start nginx
sudo systemctl enable nginx

On Alpine (common in Docker):

apk add nginx

Or pull the official Docker image:

docker pull nginx:1.27

Routing Three Microservices Through a Single Entry Point

Say you have three microservices: an authentication service, a users service, and an orders service. Here's the gateway configuration:

upstream auth_service {
    server 10.0.1.10:3000;
    keepalive 32;
}

upstream users_service {
    server 10.0.1.11:3001;
    keepalive 32;
}

upstream orders_service {
    server 10.0.1.12:3002;
    keepalive 32;
}

server {
    listen 80;
    server_name api.example.com;

    # Authentication service
    location /api/v1/auth/ {
        proxy_pass http://auth_service/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }

    # Users service
    location /api/v1/users/ {
        proxy_pass http://users_service/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }

    # Orders service
    location /api/v1/orders/ {
        proxy_pass http://orders_service/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }

    # Catch-all for undefined routes
    location / {
        default_type application/json;
        return 404 '{"error": "API endpoint not found"}';
    }
}

A few things to notice:

The keepalive 32 directive tells NGINX to maintain persistent connections to your upstream services. Without this, every proxied request opens and closes a TCP connection, adding latency and hammering your backend.

The proxy_http_version 1.1 and Connection "" headers are required for keepalive connections to work properly with upstream servers.

The X-Forwarded-For and X-Real-IP headers pass the original client IP through to your services. Without these, every request looks like it comes from your NGINX server's internal IP.

Eliminating Repeated Proxy Headers with Includes

That configuration has a lot of repeated proxy headers. Clean it up with an include file:

# /etc/nginx/proxy_params
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_http_version 1.1;
proxy_set_header Connection "";

Now your location blocks become:

location /api/v1/auth/ {
    proxy_pass http://auth_service/;
    include /etc/nginx/proxy_params;
}

Three lines per service. As your architecture grows, that matters.

Load Balancing Across Multiple Service Instances

One of NGINX's strongest features as an API gateway is built-in load balancing. When you run multiple instances of a service, NGINX distributes traffic across them automatically.

Round Robin (Default)

upstream users_service {
    server 10.0.1.11:3001;
    server 10.0.1.12:3001;
    server 10.0.1.13:3001;
    keepalive 32;
}

Requests rotate evenly across all three servers.

Weighted Load Balancing

If one server is beefier than the others:

upstream users_service {
    server 10.0.1.11:3001 weight=3;
    server 10.0.1.12:3001 weight=1;
    server 10.0.1.13:3001 weight=1;
    keepalive 32;
}

Server .11 gets three times the traffic.

Least Connections

Route to whichever server currently has the fewest active connections:

upstream users_service {
    least_conn;
    server 10.0.1.11:3001;
    server 10.0.1.12:3001;
    server 10.0.1.13:3001;
    keepalive 32;
}

This is ideal when request processing times vary significantly.
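A fourth method worth knowing is ip_hash, which pins each client IP to the same backend. It's useful when services keep per-client state in memory (sticky sessions), at the cost of less even distribution. A minimal sketch:

```nginx
upstream users_service {
    ip_hash;                  # hash the client IP to pick a server
    server 10.0.1.11:3001;
    server 10.0.1.12:3001;
    server 10.0.1.13:3001;
    keepalive 32;
}
```

If your services are stateless, prefer round robin or least_conn; ip_hash is a workaround for state, not a performance optimization.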

Passive Health Checks and Backup Servers

You can mark a server as failed after a certain number of timeouts:

upstream users_service {
    server 10.0.1.11:3001 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:3001 max_fails=3 fail_timeout=30s;
    server 10.0.1.13:3001 backup;
    keepalive 32;
}

If a server fails three times within 30 seconds, NGINX stops routing to it for 30 seconds. The backup server only receives traffic when all primary servers are down.

NGINX Plus (the commercial version) offers active health checks that proactively probe endpoints, but the open-source passive checks work well for most use cases.
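Passive checks pair naturally with retry behavior: proxy_next_upstream controls which failures cause NGINX to move on to the next server in the group. A sketch with common starting values (the retry conditions and limits shown are illustrative, not mandatory):

```nginx
location /api/v1/users/ {
    proxy_pass http://users_service/;
    include /etc/nginx/proxy_params;

    # Retry on connection errors, timeouts, and gateway-style 5xx responses
    proxy_next_upstream error timeout http_502 http_503 http_504;
    proxy_next_upstream_tries 2;        # original attempt plus one retry
    proxy_next_upstream_timeout 5s;     # stop retrying after 5 seconds total
}
```

Be deliberate about retrying non-idempotent requests: by default NGINX will not retry a request whose body was already sent to a failed server unless the failure happened before anything was transmitted.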

Rate Limiting to Protect Your Services

Rate limiting at the gateway level protects your services from abuse and keeps a single client from overwhelming your infrastructure.

http {
    # Define rate limit zones
    limit_req_zone $binary_remote_addr zone=api_general:10m rate=30r/s;
    limit_req_zone $binary_remote_addr zone=auth_limit:10m rate=5r/s;

    server {
        listen 80;
        server_name api.example.com;

        # Strict limit on auth endpoints (prevent brute force)
        location /api/v1/auth/ {
            limit_req zone=auth_limit burst=10 nodelay;
            limit_req_status 429;
            proxy_pass http://auth_service/;
            include /etc/nginx/proxy_params;
        }

        # Standard limit for general API traffic
        location /api/v1/ {
            limit_req zone=api_general burst=50 nodelay;
            limit_req_status 429;
            proxy_pass http://api_backend/;
            include /etc/nginx/proxy_params;
        }
    }
}

The $binary_remote_addr key means each client IP gets its own rate limit bucket. The burst parameter allows short spikes above the rate, and nodelay processes burst requests immediately rather than queuing them.
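By default, a rejected request gets NGINX's HTML error page. If you want the 429 response to carry a JSON body like the rest of the API, you can route it through a named location; a sketch:

```nginx
server {
    listen 80;
    server_name api.example.com;

    # Send rate-limited requests to a JSON error response
    error_page 429 = @rate_limited;

    location @rate_limited {
        default_type application/json;
        return 429 '{"error": "rate limit exceeded, retry later"}';
    }
}
```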

For API key-based rate limiting, you can use a variable from a header:

limit_req_zone $http_x_api_key zone=api_key_limit:10m rate=100r/s;

This gives each API key its own rate limit, regardless of which IP the request comes from.
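One caveat: requests without the X-Api-Key header produce an empty key, and NGINX skips rate limiting entirely when the key is empty. A map block can fall back to the client IP in that case; a sketch:

```nginx
# Use the API key when present, otherwise fall back to the client IP
map $http_x_api_key $rate_limit_key {
    ""      $binary_remote_addr;
    default $http_x_api_key;
}

limit_req_zone $rate_limit_key zone=api_key_limit:10m rate=100r/s;
```

This way anonymous traffic is still throttled per IP instead of sailing past the limiter.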

SSL/TLS Termination

Your API gateway should terminate SSL in production. Let your backend services communicate over plain HTTP internally while the gateway handles encryption.

server {
    listen 443 ssl;
    http2 on;
    server_name api.example.com;

    ssl_certificate /etc/letsencrypt/live/api.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem;

    # Modern TLS configuration
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
    ssl_prefer_server_ciphers off;

    # HSTS
    add_header Strict-Transport-Security "max-age=63072000" always;

    location /api/v1/auth/ {
        proxy_pass http://auth_service/;
        include /etc/nginx/proxy_params;
    }

    # ... other locations
}

# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name api.example.com;
    return 301 https://$server_name$request_uri;
}

With Let's Encrypt and Certbot, you can automate certificate renewal completely. Your API clients never deal with expired certificates.
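If you use Certbot's webroot plugin, the gateway must serve ACME challenges over plain HTTP before the HTTPS redirect fires. A sketch, assuming /var/www/certbot as the webroot path (adjust to wherever Certbot writes challenge files):

```nginx
server {
    listen 80;
    server_name api.example.com;

    # Serve ACME HTTP-01 challenges for certificate issuance and renewal
    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }

    # Everything else goes to HTTPS
    location / {
        return 301 https://$server_name$request_uri;
    }
}
```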

Response Caching for Read-Heavy APIs

Caching at the gateway level can dramatically reduce load on your backend services.

http {
    proxy_cache_path /var/cache/nginx/api levels=1:2
                     keys_zone=api_cache:10m
                     max_size=1g
                     inactive=60m
                     use_temp_path=off;

    server {
        listen 80;
        server_name api.example.com;

        # Cache product listings (read-heavy, changes infrequently)
        location /api/v1/products/ {
            proxy_cache api_cache;
            proxy_cache_valid 200 10m;
            proxy_cache_valid 404 1m;
            proxy_cache_key $request_uri;
            proxy_cache_use_stale error timeout updating;

            add_header X-Cache-Status $upstream_cache_status;

            proxy_pass http://products_service/;
            include /etc/nginx/proxy_params;
        }

        # Don't cache authentication or write operations
        location /api/v1/auth/ {
            proxy_no_cache 1;
            proxy_cache_bypass 1;
            proxy_pass http://auth_service/;
            include /etc/nginx/proxy_params;
        }
    }
}

The X-Cache-Status header is invaluable for debugging. It reports HIT, MISS, BYPASS, EXPIRED, STALE, or UPDATING so you can verify caching behavior (the last two appear when proxy_cache_use_stale serves stale content).
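It also helps to be able to skip the cache on demand while debugging. One pattern is to honor a custom request header (the header name here is an arbitrary choice, not an NGINX convention):

```nginx
location /api/v1/products/ {
    proxy_cache api_cache;
    proxy_cache_valid 200 10m;
    proxy_cache_key $request_uri;

    # Any non-empty, non-"0" X-Cache-Refresh header forces a backend fetch;
    # the fresh response is then stored, refreshing the cached copy
    proxy_cache_bypass $http_x_cache_refresh;

    proxy_pass http://products_service/;
    include /etc/nginx/proxy_params;
}
```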

Centralized CORS Handling

If your API serves browser clients, handling CORS at the gateway saves you from implementing it in every microservice:

location /api/v1/ {
    # Handle preflight requests
    if ($request_method = 'OPTIONS') {
        add_header 'Access-Control-Allow-Origin' '$http_origin';
        add_header 'Access-Control-Allow-Methods' 'GET, POST, PUT, DELETE, OPTIONS';
        add_header 'Access-Control-Allow-Headers' 'Authorization, Content-Type, X-Requested-With';
        add_header 'Access-Control-Max-Age' 86400;
        add_header 'Content-Length' 0;
        return 204;
    }

    add_header 'Access-Control-Allow-Origin' '$http_origin' always;
    add_header 'Access-Control-Allow-Credentials' 'true' always;

    proxy_pass http://api_backend/;
    include /etc/nginx/proxy_params;
}

One configuration change propagates to all your services instantly. Be careful with the combination above, though: reflecting $http_origin while allowing credentials effectively trusts every origin on the internet, so restrict it to known origins in production.
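If the API should only serve known frontends, a map block can restrict which origins get echoed back. A sketch (the domains listed are placeholders for your real frontends):

```nginx
# Map allowed origins to themselves; everything else gets an empty value
map $http_origin $cors_origin {
    default                       "";
    "https://app.example.com"     $http_origin;
    "https://admin.example.com"   $http_origin;
}
```

Then use $cors_origin in place of $http_origin in the add_header lines. Browsers on unlisted origins receive no Access-Control-Allow-Origin header and block the response.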

API Versioning Without Breaking Clients

NGINX makes API versioning straightforward. Route different versions to different backends:

# v1 routes to the legacy service
location /api/v1/users/ {
    proxy_pass http://users_service_v1/;
    include /etc/nginx/proxy_params;
}

# v2 routes to the rewritten service
location /api/v2/users/ {
    proxy_pass http://users_service_v2/;
    include /etc/nginx/proxy_params;
}

You can also use URL rewriting to keep a clean external API while your internal services evolve:

location /api/v1/inventory/ {
    rewrite ^/api/v1/inventory/(.*)$ /products/$1 break;
    proxy_pass http://products_service/;
    include /etc/nginx/proxy_params;
}

This lets you rename or restructure internal services without breaking external API contracts.
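Version routing also enables gradual cutovers. With split_clients you can send a slice of traffic to the rewritten service before flipping the route entirely; a sketch, where the percentage and upstream names are illustrative and both upstreams are assumed to be defined elsewhere:

```nginx
# Hash each request ID into a bucket: roughly 10% canary to v2
split_clients $request_id $users_backend {
    10%     users_service_v2;
    *       users_service_v1;
}

server {
    location /api/v1/users/ {
        # Strip the prefix with rewrite: proxy_pass with a variable
        # cannot use a URI suffix the way the static form does
        rewrite ^/api/v1/users/(.*)$ /$1 break;
        proxy_pass http://$users_backend;
        include /etc/nginx/proxy_params;
    }
}
```

Watch the error rate on the canary bucket, then raise the percentage until v1 can be retired.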

Full Docker Compose Example

Here's a production-style setup using Docker Compose with NGINX as the gateway:

services:
  gateway:
    image: nginx:1.27
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/api_gateway.conf:/etc/nginx/conf.d/default.conf
      - ./nginx/proxy_params:/etc/nginx/proxy_params
      - ./certs:/etc/nginx/certs
    depends_on:
      - auth-service
      - users-service
      - orders-service
    networks:
      - api-network

  auth-service:
    build: ./services/auth
    expose:
      - "3000"
    networks:
      - api-network

  users-service:
    build: ./services/users
    expose:
      - "3001"
    networks:
      - api-network

  orders-service:
    build: ./services/orders
    expose:
      - "3002"
    networks:
      - api-network

networks:
  api-network:
    driver: bridge

Notice that the microservices use expose (not ports). They're only accessible within the Docker network. The gateway is the only service with external port mappings. This is exactly how an API gateway should work: one entry point, everything else internal.

Your NGINX configuration uses Docker service names as hostnames:

upstream auth_service {
    server auth-service:3000;
    keepalive 16;
}

upstream users_service {
    server users-service:3001;
    keepalive 16;
}

upstream orders_service {
    server orders-service:3002;
    keepalive 16;
}

Docker's internal DNS resolves these automatically. No hardcoded IPs.
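One Docker-specific gotcha: NGINX resolves upstream hostnames once at startup, so if a service container restarts with a new IP, the gateway can keep sending traffic to the stale address until it's reloaded. Pointing a resolver at Docker's embedded DNS and using a variable forces re-resolution; a sketch:

```nginx
server {
    # Docker's embedded DNS server; re-resolve names every 10 seconds
    resolver 127.0.0.11 valid=10s;

    location /api/v1/users/ {
        set $users_upstream http://users-service:3001;
        rewrite ^/api/v1/users/(.*)$ /$1 break;
        proxy_pass $users_upstream;
        include /etc/nginx/proxy_params;
    }
}
```

The trade-off: bypassing the upstream block means losing keepalive pooling and load balancing for that route, so reserve this pattern for environments where container IPs actually churn.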

Kubernetes in 2026: Your NGINX Options After the Ingress Retirement

With the Ingress NGINX Controller retiring in March 2026, here's the landscape for NGINX-based traffic management in Kubernetes.

Option 1: NGINX Gateway Fabric (Recommended)

NGINX Gateway Fabric is the official, actively maintained implementation of the Kubernetes Gateway API using NGINX as the data plane. It's open source, backed by F5/NGINX, and is where the community is headed.

The Gateway API is the successor to the Ingress API. It uses a role-based design that separates concerns between infrastructure providers, cluster operators, and application developers. Instead of annotation-heavy Ingress resources, you get typed resources like Gateway, HTTPRoute, GRPCRoute, TCPRoute, and TLSRoute.

A basic NGINX Gateway Fabric setup:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: api-gateway
spec:
  gatewayClassName: nginx
  listeners:
    - name: http
      port: 80
      protocol: HTTP
    - name: https
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          - name: api-tls-cert

---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: auth-route
spec:
  parentRefs:
    - name: api-gateway
  hostnames:
    - "api.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/v1/auth
      backendRefs:
        - name: auth-service
          port: 3000

---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: users-route
spec:
  parentRefs:
    - name: api-gateway
  hostnames:
    - "api.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/v1/users
      backendRefs:
        - name: users-service
          port: 3001

This is cleaner than the old annotation-based Ingress approach. Each team can manage their own HTTPRoute resources without touching the shared Gateway.

Option 2: F5 NGINX Ingress Controller (Commercial)

If you need to stay on the Ingress API, F5's commercial NGINX Ingress Controller is a completely separate codebase from the retired community project. It's actively maintained, has enterprise support, and won't be affected by the retirement.

Option 3: Non-NGINX Gateway API Implementations

If you're open to alternatives, several other projects implement the Gateway API:

  • Envoy Gateway: Community-driven, Envoy-based, strong Gateway API conformance
  • Traefik: Simple, single-binary, good for GitOps workflows
  • Cilium: eBPF-based, good if you're already using Cilium for networking
  • HAProxy: Familiar to teams already running HAProxy

Migrating from the Retired Ingress NGINX Controller

If you're currently running the deprecated Ingress NGINX Controller, here's the migration path:

  1. Audit your current Ingress resources. Document every annotation you're using, especially controller-specific ones like nginx.ingress.kubernetes.io/rewrite-target or nginx.ingress.kubernetes.io/ssl-redirect.
  2. Use the ingress2gateway tool. Kubernetes provides this conversion utility to translate Ingress resources into Gateway API resources. The 1.0 release (March 2026) includes controller-level integration tests verifying behavioral equivalence.
  3. Deploy your chosen Gateway API implementation alongside the existing controller. Run both in parallel during migration.
  4. Migrate routes incrementally. Move one service at a time, validate, then proceed.
  5. Decommission Ingress NGINX. Once all traffic flows through the new gateway, remove the old controller and its configuration.

The Gateway API handles everything the Ingress API did, plus native support for TCP/UDP routing, traffic splitting, header-based matching, and gRPC: features that required vendor-specific annotations with Ingress.
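Traffic splitting, for example, is first-class in HTTPRoute: weights on backendRefs replace the canary annotations the old controller needed. A sketch (service names and weights are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: users-canary
spec:
  parentRefs:
    - name: api-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api/v1/users
      backendRefs:
        - name: users-service-v1
          port: 3001
          weight: 90
        - name: users-service-v2
          port: 3001
          weight: 10
```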

Production Hardening Checklist

Before you put your NGINX API gateway in front of real traffic, make sure you've covered these:

Security:

  • SSL/TLS with modern cipher suites (TLSv1.2+ only)
  • Rate limiting on authentication endpoints
  • Request body size limits (client_max_body_size)
  • Disable server tokens (server_tokens off)
  • Add security headers (HSTS, X-Content-Type-Options, X-Frame-Options)

Performance:

  • Enable gzip compression for JSON responses
  • Configure keepalive connections to upstream services
  • Tune worker processes and connections (worker_processes auto)
  • Set appropriate timeouts (proxy_connect_timeout, proxy_read_timeout)
  • Enable response buffering for slow clients

Observability:

  • Structured JSON access logs with request timing
  • Error logging at the appropriate level
  • Include correlation IDs ($request_id)
  • Monitor upstream response times

Resilience:

  • Health checks on upstream servers
  • Graceful degradation with proxy_next_upstream
  • Circuit breaker patterns with max_fails and fail_timeout
  • Backup servers for critical services
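Many of the security and performance items above are single directives in the http context. A sketch with illustrative starting values, to be tuned per workload:

```nginx
server_tokens off;                    # hide the NGINX version in headers and error pages
client_max_body_size 2m;              # reject oversized request bodies early

gzip on;
gzip_types application/json;          # compress JSON API responses
gzip_min_length 1024;                 # skip tiny payloads where gzip isn't worth it

keepalive_timeout 65s;
proxy_connect_timeout 5s;             # fail fast on unreachable upstreams
proxy_read_timeout 30s;

add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "DENY" always;
```

One inheritance quirk to remember: add_header directives at the http level are dropped in any server or location block that declares its own add_header, so repeat the security headers where needed.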

Here's a production-ready JSON logging configuration:

log_format api_json escape=json
    '{'
        '"time":"$time_iso8601",'
        '"request_id":"$request_id",'
        '"remote_addr":"$remote_addr",'
        '"method":"$request_method",'
        '"uri":"$request_uri",'
        '"status":$status,'
        '"body_bytes_sent":$body_bytes_sent,'
        '"request_time":$request_time,'
        '"upstream_response_time":"$upstream_response_time",'
        '"upstream_addr":"$upstream_addr",'
        '"http_user_agent":"$http_user_agent"'
    '}';

access_log /var/log/nginx/api_access.log api_json;

JSON-formatted logs make it trivial to parse with any modern log aggregation tool.
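For basic gateway-level metrics, the stub_status module (built into most packages) exposes connection counters that monitoring agents can scrape. A sketch, bound to localhost so it never faces the internet:

```nginx
server {
    listen 127.0.0.1:8080;            # internal-only status endpoint

    location /nginx_status {
        stub_status;                  # active connections, accepts, requests
        allow 127.0.0.1;
        deny all;
    }
}
```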

NGINX vs. Dedicated API Gateways: Choosing the Right Tool

Choose NGINX when:

  • You need a lightweight, high-performance gateway with minimal overhead
  • Your team is comfortable writing NGINX configuration
  • You want full control without vendor dependencies
  • Your API gateway needs are primarily routing, load balancing, and SSL termination
  • You're already running NGINX for other purposes

Choose a dedicated API gateway (Kong, Tyk, AWS API Gateway) when:

  • You need a developer portal with API documentation
  • You require built-in OAuth/OpenID Connect flows
  • You want a GUI for managing routes and policies
  • You need advanced analytics and monetization features
  • Your team isn't comfortable with NGINX configuration

Choose a service mesh (Istio, Linkerd) when:

  • You need service-to-service (east-west) traffic management
  • mTLS between all services is a requirement
  • You need distributed tracing built into the infrastructure layer
  • You're running hundreds of microservices in Kubernetes

Many production setups combine these: NGINX as the edge API gateway handling external traffic, with a service mesh managing internal communication.

What to Do Next

If you're running NGINX as a standalone API gateway on VMs, containers, or bare metal: keep doing what you're doing. Update your TLS configuration to TLSv1.2+ only, make sure you're rate limiting auth endpoints, and keep your configs version-controlled.

If you're running the community Ingress NGINX Controller in Kubernetes: start your migration now. The March 31, 2026 deadline is real. NGINX Gateway Fabric gives you continuity with the NGINX data plane while moving to the standardized Gateway API. Run both in parallel, migrate incrementally, and decommission the old controller once traffic is flowing cleanly.

The best API gateway is the one your team understands, can debug at 2 AM, and scales with your architecture. For a lot of us, that's still NGINX.