NGINX Tutorial: Protect Kubernetes APIs with Rate Limiting

Note: This tutorial is part of Microservices March 2022: Kubernetes Networking.

Reduce Kubernetes Latency with Autoscaling

Protect Kubernetes APIs with Rate Limiting (this post)

Protect Kubernetes Apps from SQL Injection (coming soon)

Improve Uptime and Resilience with a Canary Deployment (coming soon)

Your organization just launched its first app and API in Kubernetes. You’ve been told to expect high traffic volumes (and already implemented autoscaling to ensure NGINX Ingress Controller can quickly route the traffic), but there are concerns that the API may be targeted by a malicious attack. If the API receives a high volume of HTTP requests – a possibility with brute‑force password guessing or DDoS attacks – then both the API and app could be overwhelmed and might even crash.

But you’re in luck! The traffic control technique “rate limiting” is an API gateway use case that limits the incoming request rate to a value typical for real users. You configure NGINX Ingress Controller to implement a rate limiting policy, which prevents the app and API from getting overwhelmed by too many requests. Nice work!

Lab and Tutorial Overview

This blog accompanies the lab for Unit 2 of Microservices March 2022 – Exposing APIs in Kubernetes – but you can also use it as a tutorial in your own environment (get the examples from our GitHub repo). It demonstrates how to combine multiple NGINX Ingress Controllers with rate limiting to prevent apps and APIs from getting overwhelmed.

The easiest way to do the lab is to register for Microservices March 2022 and use the browser-based lab that’s provided. If you want to do it as a tutorial in your own environment, you need a machine with:

2 CPUs or more
2 GB of free memory
20 GB of free disk space
Internet connection
Container or virtual machine manager, such as Docker, Hyperkit, Hyper-V, KVM, Parallels, Podman, VirtualBox, or VMware Fusion/Workstation
minikube installed
Helm installed

Note: This blog is written for minikube running on a desktop/laptop that can launch a browser window. If you’re in an environment where that’s not possible, then you’ll need to troubleshoot how to get to the services via a browser.

To get the most out of the lab and tutorial, we recommend that before beginning you:

Watch the recording of the livestreamed conceptual overview
Review the background blogs, webinar, and video
Watch the 18-minute video summary of the lab

This tutorial uses these technologies:

NGINX Ingress Controller (based on NGINX Open Source)
Helm
KEDA
Locust
minikube
Podinfo
Prometheus

This tutorial includes three challenges:

Deploy a Cluster, App, API, and Ingress Controller
Overwhelm Your App and API
Save Your App and API with Dual Ingress Controller and Rate Limiting

Challenge 1: Deploy a Cluster, App, API, and Ingress Controller

In this challenge, you will deploy a minikube cluster and install Podinfo as a sample app and API.

Create a Minikube Cluster

Create a minikube cluster. After a few seconds, a message confirms the deployment was successful.

$ minikube start 
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

Install the Podinfo App and Podinfo API

Step 1: Create a Deployment

Podinfo is a “web application made with Go that showcases best practices of running microservices in Kubernetes”. We’re using it as a sample app and API because of its small footprint.

Using the text editor of your choice, create a YAML file called 1-apps.yaml with the following contents. It defines a Deployment and a Service for each of:

A web app called “Podinfo Frontend” that renders an HTML page
An API called “Podinfo API” that returns a JSON payload

apiVersion: apps/v1 
kind: Deployment 
metadata: 
  name: api 
spec: 
  selector: 
    matchLabels: 
      app: api 
  template: 
    metadata: 
      labels: 
        app: api 
    spec: 
      containers: 
        - name: api 
          image: stefanprodan/podinfo 
          ports: 
            - containerPort: 9898 
--- 
apiVersion: v1 
kind: Service 
metadata: 
  name: api 
spec: 
  ports: 
    - port: 80 
      targetPort: 9898 
      nodePort: 30001 
  selector: 
    app: api 
  type: LoadBalancer 
--- 
apiVersion: apps/v1 
kind: Deployment 
metadata: 
  name: frontend 
spec: 
  selector: 
    matchLabels: 
      app: frontend 
  template: 
    metadata: 
      labels: 
        app: frontend 
    spec: 
      containers: 
        - name: frontend 
          image: stefanprodan/podinfo 
          ports: 
            - containerPort: 9898 
--- 
apiVersion: v1 
kind: Service 
metadata: 
  name: frontend 
spec: 
  ports: 
    - port: 80 
      targetPort: 9898 
      nodePort: 30002 
  selector: 
    app: frontend 
  type: LoadBalancer

Deploy the app and API:

$ kubectl apply -f 1-apps.yaml
deployment.apps/api created 
service/api created 
deployment.apps/frontend created 
service/frontend created

Confirm that the Podinfo pods deployed, as indicated by the value Running in the STATUS column.

$ kubectl get pods  
NAME                        READY   STATUS    RESTARTS   AGE
api-7574cf7568-c6tr6        1/1     Running   0          87s 
frontend-6688d86fc6-78qn7   1/1     Running   0          87s

Deploy NGINX Ingress Controller

The fastest way to install NGINX Ingress Controller is with Helm.

Install NGINX Ingress Controller in a separate namespace (“nginx”) using Helm.

Create the namespace:

 kubectl create namespace nginx

Add the NGINX repository to Helm:

 helm repo add nginx-stable https://helm.nginx.com/stable

Download and install NGINX Ingress Controller in your cluster:

 helm install main nginx-stable/nginx-ingress \ 
 --set controller.watchIngressWithoutClass=true \ 
 --set controller.ingressClass=nginx \ 
 --set controller.service.type=NodePort \ 
 --set controller.service.httpPort.nodePort=30010 \ 
 --set controller.enablePreviewPolicies=true \ 
 --namespace nginx

Confirm that the NGINX Ingress Controller pod deployed, as indicated by the value Running in the STATUS column.

$ kubectl get pods --namespace nginx
NAME                                  READY   STATUS    RESTARTS   AGE 
main-nginx-ingress-779b74bb8b-d4qtc   1/1     Running   0          92s

Route Traffic to Your App

Using the text editor of your choice, create a YAML file called 2-ingress.yaml with the following contents. It defines the Ingress manifest required to route traffic to the app and API.

apiVersion: networking.k8s.io/v1 
kind: Ingress 
metadata: 
  name: first 
spec: 
  ingressClassName: nginx 
  rules: 
    - host: "example.com" 
      http: 
        paths: 
          - backend: 
              service: 
                name: frontend 
                port: 
                  number: 80 
            path: / 
            pathType: Prefix 
    - host: "api.example.com" 
      http: 
        paths: 
          - backend: 
              service: 
                name: api 
                port: 
                  number: 80 
            path: / 
            pathType: Prefix

Deploy the Ingress resource:

$ kubectl apply -f 2-ingress.yaml 
ingress.networking.k8s.io/first created
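
The two rules in 2-ingress.yaml route purely on the Host header. As a rough mental model only (this is not how NGINX Ingress Controller is implemented, and the names INGRESS_RULES and route_request are made up for illustration), the lookup can be sketched in Python like this:

```python
# Conceptual model of the two Ingress rules: the Host header selects a backend Service.
# INGRESS_RULES and route_request are illustrative names, not part of any NGINX API.
INGRESS_RULES = {
    "example.com": ("frontend", 80),
    "api.example.com": ("api", 80),
}


def route_request(host_header: str):
    """Return (service, port) for a Host header, or None if no rule matches."""
    # Drop an optional ":port" suffix and normalise case before matching.
    hostname = host_header.split(":")[0].lower()
    return INGRESS_RULES.get(hostname)


for host in ("example.com", "api.example.com", "unknown.example.org"):
    print(host, "->", route_request(host))
```

This is why the tests in the next section set the Host header explicitly: the same NGINX Ingress Controller address serves both the app and the API, and only the header tells them apart.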

Test the Ingress Configuration

Create a temporary pod:

To ensure your Ingress configuration is performing as expected, you’ll test it using a temporary pod. Launch a disposable busybox pod in the cluster:

kubectl run -ti --rm=true busybox --image=busybox 
If you don't see a command prompt, try pressing enter. 
/ #

Test the Podinfo API:

Issue a request to the NGINX Ingress Controller pod with the hostname “api.example.com”:

wget --header="Host: api.example.com" -qO- main-nginx-ingress.nginx

If your API is properly receiving traffic, you’ll get:

{ 
  "hostname": "api-687fd448f8-t7hqk", 
  "version": "6.0.3", 
  "revision": "", 
  "color": "#34577c", 
  "logo": "https://raw.githubusercontent.com/stefanprodan/podinfo/gh-pages/cuddle_clap.gif", 
  "message": "greetings from podinfo v6.0.3", 
  "goos": "linux", 
  "goarch": "arm64", 
  "runtime": "go1.16.9", 
  "num_goroutine": "6", 
  "num_cpu": "4" 
}

Test Podinfo Frontend

Use the same busybox to simulate a web browser and retrieve the webpage by entering:

wget --header="Host: example.com" --header="User-Agent: Mozilla" -qO- main-nginx-ingress.nginx

If successful, you’ll see a long response that begins with:

<!DOCTYPE html> 
<html> 
<head> 
  <title>frontend-596d5c9ff4-xkbdc</title>

Use minikube service frontend to open the Podinfo app in a browser. You should see the welcome page.

Congratulations! NGINX Ingress Controller is receiving requests and dispatching to the app and API.

End the busybox session by typing exit at the command prompt of the temporary pod to return to the Kubernetes server.

Challenge 2: Overwhelm Your App and API

In this challenge, you will use Locust, an open source load-testing tool, to simulate a traffic surge that overwhelms the API and causes the app to crash.

Install Locust

First, we have to deploy the traffic generator Locust so we can watch the app and API respond to traffic.

Using the text editor of your choice, create a YAML file called 3-locust.yaml with the following contents. The Deployment and Service objects define the Locust pod. The ConfigMap object defines a script called locustfile.py which generates requests to be sent to the pod, complete with the correct headers. The traffic will not be distributed evenly between the app and API – requests are skewed to Podinfo API, with only about one in six requests going to Podinfo Frontend (the two tasks are weighted 1 to 5).

apiVersion: v1 
kind: ConfigMap 
metadata: 
  name: locust-script 
data: 
  locustfile.py: |- 
    from locust import HttpUser, task, between 

    class QuickstartUser(HttpUser): 
        wait_time = between(0.7, 1.3) 

        @task(1) 
        def visit_website(self): 
            with self.client.get("/", headers={"Host": "example.com", "User-Agent": "Mozilla"}, timeout=0.2, catch_response=True) as response: 
                if response.request_meta["response_time"] > 200: 
                    response.failure("Frontend failed") 
                else: 
                    response.success() 
  

        @task(5) 
        def visit_api(self): 
            with self.client.get("/", headers={"Host": "api.example.com"}, timeout=0.2) as response: 
                if response.request_meta["response_time"] > 200: 
                    response.failure("API failed") 
                else: 
                    response.success() 
--- 
apiVersion: apps/v1 
kind: Deployment 
metadata: 
  name: locust 
spec: 
  selector: 
    matchLabels: 
      app: locust 
  template: 
    metadata: 
      labels: 
        app: locust 
    spec: 
      containers: 
        - name: locust 
          image: locustio/locust 
          ports: 
            - containerPort: 8089 
          volumeMounts: 
            - mountPath: /home/locust 
              name: locust-script 
      volumes: 
        - name: locust-script 
          configMap: 
            name: locust-script 
--- 
apiVersion: v1 
kind: Service 
metadata: 
  name: locust 
spec: 
  ports: 
    - port: 8089 
      targetPort: 8089 
      nodePort: 30015 
  selector: 
    app: locust 
  type: LoadBalancer 

Locust reads locustfile.py, which is stored in the ConfigMap:

from locust import HttpUser, task, between 
class QuickstartUser(HttpUser): 
    wait_time = between(0.7, 1.3) 

    @task(1) 
    def visit_website(self): 
        with self.client.get("/", headers={"Host": "example.com", "User-Agent": "Mozilla"}, timeout=0.2, catch_response=True) as response: 
            if response.request_meta["response_time"] > 200: 
                response.failure("Frontend failed") 
            else: 
                response.success() 
    @task(5) 
    def visit_api(self): 
        with self.client.get("/", headers={"Host": "api.example.com"}, timeout=0.2) as response: 
            if response.request_meta["response_time"] > 200: 
                response.failure("API failed") 
            else: 
                response.success()
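
The numbers passed to @task are relative weights, so the split between the two tasks follows directly from them. Here is a small sketch of that arithmetic, assuming Locust picks tasks in proportion to their weights; the helper traffic_share is a made-up name for illustration:

```python
from fractions import Fraction

# Task weights copied from locustfile.py: @task(1) and @task(5).
TASK_WEIGHTS = {"visit_website": 1, "visit_api": 5}


def traffic_share(weights):
    """Return each task's expected share of requests as an exact fraction."""
    total = sum(weights.values())
    return {name: Fraction(weight, total) for name, weight in weights.items()}


shares = traffic_share(TASK_WEIGHTS)
for name, share in shares.items():
    print(f"{name}: {share} of requests")
```

With weights of 1 and 5, the frontend task accounts for 1/6 of the requests and the API task for 5/6.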

Deploy Locust:

$ kubectl apply -f  3-locust.yaml 
configmap/locust-script created 
deployment.apps/locust created 
service/locust created

Confirm Locust deployment by retrieving your pods using kubectl get pods. The Locust pod is likely still in "ContainerCreating" at this point. Wait for it to display “Running” before continuing to the next section.

NAME                        READY   STATUS              RESTARTS   AGE 
api-7574cf7568-c6tr6        1/1     Running             0          33m 
frontend-6688d86fc6-78qn7   1/1     Running             0          33m 
locust-77c699c94d-hc76t     0/1     ContainerCreating   0          4s

Simulate a Traffic Surge and Observe the Effect on Performance

Use minikube service locust to open Locust in a browser.
Enter the following values in the fields:

Number of users – 1000
Spawn rate – 30
Host – http://main-nginx-ingress.nginx

Click the Start swarming button to send traffic to the Podinfo app and API.

Charts Tab: This tab provides a graphical depiction of the traffic. As API requests increase, watch your Podinfo API response times worsen.
Failures Tab: Because your web app and API share an Ingress controller, notice the web app soon returns errors and fails as a result of the increased API requests.

This is problematic because a single bad actor using the API could take down all apps served by the same Ingress controller – not just the API!

Challenge 3: Save Your App and API with Dual Ingress Controller and Rate Limiting

In the final challenge, you will resolve the limitations of the previous deployment.

First, let’s look at how to address the problems with the architecture. In the previous challenge, you overwhelmed NGINX Ingress Controller with API requests, which also impacted your app Podinfo Frontend. This happened because a single Ingress controller was responsible for routing traffic to both the app and the API.

Running a separate NGINX Ingress Controller pod for each of your services prevents your app from being impacted by too many API requests. This isn’t necessarily required for every use case, but in our simulation it’s easy to see the benefits of running multiple Ingress controllers.

The second part of the solution, which will prevent the Podinfo API from getting overwhelmed, is to implement rate limiting by using NGINX Ingress Controller as an API gateway.

What is Rate Limiting?

Rate limiting restricts the number of requests a user can make in a given time period. When under a DDoS attack, for example, you can use rate limiting to limit the incoming request rate to a value typical for real users. When rate limiting is implemented with NGINX, clients that submit too many requests receive an error response (by default HTTP 503, Service Temporarily Unavailable) so they cannot negatively impact the API. Learn how this works in the NGINX Ingress Controller documentation.
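
To make the idea concrete, here is a minimal sketch of per-client rate limiting in Python. It is a deliberately simplified model and not NGINX's implementation (NGINX uses a leaky bucket algorithm and also supports bursts and delays, which are ignored here). The class name SimpleRateLimiter and the sample IP addresses are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleRateLimiter:
    """Toy per-key limiter: at most `rate_per_second` requests per key, no burst."""
    rate_per_second: float
    last_accepted: dict = field(default_factory=dict)  # key -> time of last accepted request

    def allow(self, key: str, now: float) -> bool:
        min_interval = 1.0 / self.rate_per_second
        previous = self.last_accepted.get(key)
        if previous is not None and now - previous < min_interval:
            return False  # too soon: a real proxy would answer with an error response
        self.last_accepted[key] = now
        return True


limiter = SimpleRateLimiter(rate_per_second=1)
# Timestamps (in seconds) are passed in explicitly so the run is repeatable.
decisions = [
    limiter.allow("10.0.0.1", 0.0),  # first request from client A: accepted
    limiter.allow("10.0.0.1", 0.4),  # client A again after 0.4 s: rejected
    limiter.allow("10.0.0.2", 0.5),  # client B has its own budget: accepted
    limiter.allow("10.0.0.1", 1.2),  # client A after more than 1 s: accepted
]
print(decisions)
```

The point to notice is that the limit is tracked per key (here, the client address), so one noisy client is throttled without affecting the others.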

What is an API Gateway?

An API gateway routes API requests from a client to the appropriate services. A big misunderstanding about this simple definition is the idea that an API gateway is a unique piece of technology. It’s not. Rather, “API gateway” describes a set of use cases that can be implemented via different types of proxies – most commonly an ADC or load balancer and reverse proxy, and increasingly an Ingress controller or service mesh. Rate limiting is a common use case for deploying an API gateway. Learn more about API gateway use cases in Kubernetes in the blog How Do I Choose? API Gateway vs. Ingress Controller vs. Service Mesh.

Prepare Your Cluster

Delete the Ingress definition:

Before you can implement this new architecture and policy, you must delete the original Ingress definition:

kubectl delete -f 2-ingress.yaml 
ingress.networking.k8s.io "first" deleted

Create two namespaces:

Now, create two new namespaces – one for each NGINX Ingress Controller.

Serving Podinfo Frontend app:

kubectl create namespace nginx-web 
namespace/nginx-web created

Serving Podinfo API:

kubectl create namespace nginx-api 
namespace/nginx-api created

Install the nginx-web NGINX Ingress Controller

Install NGINX Ingress Controller:

This NGINX Ingress Controller pod serves requests to Podinfo Frontend.

helm install web nginx-stable/nginx-ingress \
  --set controller.ingressClass=nginx-web \ 
  --set controller.service.type=NodePort \ 
  --set controller.service.httpPort.nodePort=30020 \ 
  --namespace nginx-web

Create the Ingress manifest:

Create an Ingress manifest called 4-ingress-web.yaml for the Podinfo Frontend app.

apiVersion: networking.k8s.io/v1 
kind: Ingress 
metadata: 
  name: frontend 
spec: 
  ingressClassName: nginx-web 
  rules: 
    - host: "example.com" 
      http: 
        paths: 
          - backend: 
              service: 
                name: frontend 
                port: 
                  number: 80 
            path: / 
            pathType: Prefix

Deploy the new manifest:

kubectl apply -f 4-ingress-web.yaml 
ingress.networking.k8s.io/frontend created

Install the nginx-api NGINX Ingress Controller

The manifest you created in the last section is exclusively for the nginx-web Ingress controller, as indicated by the ingressClassName field. Next, you will install an Ingress controller for Podinfo API, including a rate limiting policy to prevent your API from getting overwhelmed.

There are two ways to configure rate limiting with NGINX Ingress Controller:

Option 1: NGINX Ingress Resources

NGINX Ingress resources are custom resources that offer an alternative to the standard Kubernetes Ingress resource. They provide a native, type‑safe, and indented configuration style which simplifies implementation of Ingress load balancing capabilities, including:

Circuit breaking – For appropriate handling of application errors.
Sophisticated routing – For A/B testing and blue‑green deployments.
Header manipulation – For offloading application logic to the NGINX Ingress controller.
Mutual TLS authentication (mTLS) – For zero‑trust or identity‑based security.
Web application firewall (WAF) – For protection against HTTP vulnerability attacks.

Option 2: Snippets

While snippets would work for this tutorial, we recommend avoiding snippets whenever possible because they’re error-prone, difficult to work with, don’t provide fine-grained control, and can have security issues.

This tutorial uses NGINX Ingress resources, which offer numerous configuration options for rate limiting. In this challenge, you will use just the three required rate limiting parameters:

Rate: The rate of requests permitted. The rate is specified in requests per second (r/s) or requests per minute (r/m).
Key: The key to which the rate limit is applied. Can contain text, variables, or a combination of them.
Zone Size: Size of the shared memory zone – configurable in MB or KB. It’s required for NGINX worker processes to keep track of requests.

Let’s look at those parameters in an example. Here you can see that clients are restricted to 1 request per second, based on the NGINX variable ${binary_remote_addr} which instructs NGINX Ingress Controller to limit based on unique IP address.

    rateLimit: 
      rate: 1r/s 
      key: ${binary_remote_addr} 
      zoneSize: 10M
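
To get a feel for what a given rate value means for a single client, the following Python sketch converts a rate string into the average spacing between accepted requests. It is only an illustration of the two units described above (r/s and r/m); the function names are made up and this is not the parser NGINX Ingress Controller uses.

```python
import re

# Pattern for the two units named above: requests per second (r/s) or per minute (r/m).
RATE_PATTERN = re.compile(r"^(\d+)r/(s|m)$")


def requests_per_second(rate: str) -> float:
    """Convert a rate string such as '1r/s' or '30r/m' to requests per second."""
    match = RATE_PATTERN.match(rate)
    if match is None or int(match.group(1)) == 0:
        raise ValueError(f"unsupported rate: {rate!r}")
    count, unit = int(match.group(1)), match.group(2)
    return float(count) if unit == "s" else count / 60.0


def min_spacing_seconds(rate: str) -> float:
    """Average spacing between accepted requests from one key at this rate."""
    return 1.0 / requests_per_second(rate)


for example in ("1r/s", "10r/s", "30r/m"):
    print(example, "->", min_spacing_seconds(example), "seconds between requests per key")
```

With key set to ${binary_remote_addr}, each unique client IP address gets its own allowance at this spacing.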

Install NGINX Ingress Controller:

This NGINX Ingress Controller pod will serve requests to Podinfo API.

helm install api nginx-stable/nginx-ingress \
  --set controller.ingressClass=nginx-api \ 
  --set controller.service.type=NodePort \ 
  --set controller.service.httpPort.nodePort=30030 \ 
  --set controller.enablePreviewPolicies=true \ 
  --namespace nginx-api

Create the Ingress manifest:

Create an Ingress manifest called 5-ingress-api.yaml. It defines the rate limiting policy and a VirtualServer that applies it to Podinfo API.

apiVersion: k8s.nginx.org/v1 
kind: Policy 
metadata: 
  name: rate-limit-policy 
spec: 
  rateLimit: 
    rate: 10r/s 
    key: ${binary_remote_addr} 
    zoneSize: 10M 
--- 
apiVersion: k8s.nginx.org/v1 
kind: VirtualServer 
metadata: 
  name: api-vs 
spec: 
  ingressClassName: nginx-api 
  host: api.example.com 
  policies: 
  - name: rate-limit-policy 
  upstreams: 
  - name: api 
    service: api 
    port: 80 
  routes: 
  - path: / 
    action: 
      pass: api

Deploy the new manifest:

kubectl apply -f 5-ingress-api.yaml 
policy.k8s.nginx.org/rate-limit-policy created
virtualserver.k8s.nginx.org/api-vs created

Reconfigure Locust

Now, repeat the Locust experiment. You’ll look for two changes:

Podinfo API doesn’t get overloaded.
No matter how many requests are sent to Podinfo API, there will be no impact on Podinfo Frontend.

Change the Locust script so that:

All the requests to the web app go through nginx-web (http://web-nginx-ingress.nginx-web)
All API requests go to nginx-api (http://api-nginx-ingress.nginx-api)

Because Locust supports a single URL in the dashboard, you will hardcode the value in the Python script using a YAML file 6-locust.yaml. Take note of the URLs in each “task”.

apiVersion: v1 
kind: ConfigMap 
metadata: 
  name: locust-script 
data: 
  locustfile.py: |- 
    from locust import HttpUser, task, between 

    class QuickstartUser(HttpUser): 
        wait_time = between(0.7, 1.3) 

        @task(1) 
        def visit_website(self): 
            with self.client.get("http://web-nginx-ingress.nginx-web/", headers={"Host": "example.com", "User-Agent": "Mozilla"}, timeout=0.2, catch_response=True) as response: 
                if response.request_meta["response_time"] > 200: 
                    response.failure("Frontend failed") 
                else: 
                    response.success() 
  

        @task(5) 
        def visit_api(self): 
            with self.client.get("http://api-nginx-ingress.nginx-api/", headers={"Host": "api.example.com"}, timeout=0.2) as response: 
                if response.request_meta["response_time"] > 200: 
                    response.failure("API failed") 
                else: 
                    response.success() 
--- 
apiVersion: apps/v1 
kind: Deployment 
metadata: 
  name: locust 
spec: 
  selector: 
    matchLabels: 
      app: locust 
  template: 
    metadata: 
      labels: 
        app: locust 
    spec: 
      containers: 
        - name: locust 
          image: locustio/locust 
          ports: 
            - containerPort: 8089 
          volumeMounts: 
            - mountPath: /home/locust 
              name: locust-script 
      volumes: 
        - name: locust-script 
          configMap: 
            name: locust-script 
--- 
apiVersion: v1 
kind: Service 
metadata: 
  name: locust 
spec: 
  ports: 
    - port: 8089 
      targetPort: 8089 
      nodePort: 30015 
  selector: 
    app: locust 
  type: LoadBalancer

Implement the script change:

Submit the new YAML file. You should get this output confirming the script was changed (“configured”) and the rest remains unchanged.

kubectl apply -f 6-locust.yaml 
configmap/locust-script configured 
deployment.apps/locust unchanged 
service/locust unchanged

Force a reload:

Delete the Locust pods to force a reload of the new config map. The following command simultaneously retrieves your pods and deletes the Locust pod. It includes some Linux commands and pipes.

kubectl delete pod `kubectl get pods | grep locust | awk '{print $1}'`
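
If the grep and awk pipeline is unfamiliar, this Python sketch mimics what it does to the text printed by kubectl get pods: keep the lines that contain "locust" and take the first column. The sample text is copied from the earlier output in this tutorial, and the function name pod_names_matching is made up for illustration.

```python
# Sample text in the format printed by `kubectl get pods` earlier in this tutorial.
SAMPLE_OUTPUT = """\
NAME                        READY   STATUS              RESTARTS   AGE
api-7574cf7568-c6tr6        1/1     Running             0          33m
frontend-6688d86fc6-78qn7   1/1     Running             0          33m
locust-77c699c94d-hc76t     0/1     ContainerCreating   0          4s
"""


def pod_names_matching(kubectl_output: str, needle: str):
    """Mimic `grep needle | awk '{print $1}'`: first column of matching lines."""
    names = []
    for line in kubectl_output.splitlines():
        if needle in line:
            names.append(line.split()[0])
    return names


print(pod_names_matching(SAMPLE_OUTPUT, "locust"))
```

The resulting pod name is what gets passed to kubectl delete pod. Because the pod belongs to a Deployment, Kubernetes starts a replacement that mounts the updated ConfigMap.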

Last, retrieve the pods and confirm Locust has been reloaded. You should see that the Locust pod age is just a few seconds old.


kubectl get pods 
NAME                        READY   STATUS    RESTARTS   AGE 
api-7574cf7568-jrlvd        1/1     Running   0          9m57s 
frontend-6688d86fc6-vd856   1/1     Running   0          9m57s 
locust-77c699c94d-6chsg     1/1     Running   0          6s

Test Rate Limiting

Return to Locust and change the parameters:

Number of users – 400
Spawn rate – 10
Host – http://main-nginx-ingress.nginx

Click the Start swarming button to send traffic to the Podinfo app and API.

You will see that as the number of users starts to climb, so does the error rate. However, when you click Failures in the Locust UI, you will see that the failures no longer come from the web frontend but from the API service. Also notice that NGINX is serving back a ‘Service Temporarily Unavailable’ message. This is part of the rate limiting feature and it can be customized. The API is rate limited, and the web application is always available. Well done!
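
A back-of-the-envelope calculation shows why the error rate climbs so high. The sketch below uses only numbers from this tutorial, but the model is an assumption: it ignores response times (so it overstates the offered load somewhat) and assumes NGINX sees all Locust users as a single client address because they run in one pod.

```python
# Rough estimate only; the simplifying assumptions are described in the text above.
USERS = 400                    # "Number of users" entered in Locust
MEAN_WAIT_SECONDS = 1.0        # midpoint of between(0.7, 1.3)
API_WEIGHT, WEB_WEIGHT = 5, 1  # @task(5) and @task(1)
RATE_LIMIT_RPS = 10            # rate: 10r/s in rate-limit-policy

total_rps = USERS / MEAN_WAIT_SECONDS
api_rps = total_rps * API_WEIGHT / (API_WEIGHT + WEB_WEIGHT)
accepted_rps = min(api_rps, RATE_LIMIT_RPS)
rejected_share = 1 - accepted_rps / api_rps

print(f"API requests offered:  ~{api_rps:.0f} per second")
print(f"API requests accepted: ~{accepted_rps:.0f} per second")
print(f"API requests rejected: ~{rejected_share:.0%}")
```

Under these assumptions, roughly 333 API requests per second are offered against a limit of 10 per second, so the large majority are rejected while the frontend, which has its own Ingress controller, is unaffected.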

Next Steps

In the real world, rate limiting alone won’t protect your apps and APIs from bad actors. Your security requirements probably include at least one or two of the following methods for protecting Kubernetes apps, APIs, and infrastructure:

Authentication and authorization
Web application firewall and DDoS protection
End-to-end encryption and Zero Trust
Compliance with industry regulations

We cover these topics and more in Unit 3: Microservices Security Pattern.

You can use this blog to implement the tutorial in your own environment or try it out in our browser-based lab (register here). To learn more on the topic of exposing Kubernetes services, follow along with the other activities in Unit 2: Exposing APIs in Kubernetes:

Watch the high-level overview webinar
Review the collection of technical blogs and videos

To try NGINX Ingress Controller for Kubernetes with NGINX Plus and NGINX App Protect, start your free 30-day trial today or contact us to discuss your use cases.

To try NGINX Ingress Controller with NGINX Open Source, you can obtain the release source code, or download a prebuilt container from DockerHub.