Dynamic A/B Testing with NGINX Plus

Original: https://www.nginx.com/blog/dynamic-a-b-testing-with-nginx-plus/

The key‑value store feature was introduced in NGINX Plus R13 for HTTP traffic and extended to TCP/UDP (stream) traffic in NGINX Plus R14. This feature provides an API for dynamically maintaining values that can be used as part of the NGINX Plus configuration, without requiring a reload of the configuration. There are many possible use cases for this feature, and I have no doubt that our customers will find a variety of ways to take advantage of it.

This blog post describes one use case, dynamically altering how the Split Clients module is used to do A/B testing.

The Key-Value Store

The NGINX Plus API can be used to maintain a set of key‑value pairs which NGINX Plus can access at runtime. For example, let’s look at the use case where you want to keep a blacklist of client IP addresses that are not allowed to access your site (or particular URLs). The key is the client IP address, which is available in the $remote_addr variable. The value is a variable or other setting that indicates whether that client IP address is blacklisted.

We name the value variable $blacklist_status and set it to 1 to indicate that the client IP address is blacklisted. To configure this, we follow these steps:

For the state file, we have previously created the /etc/nginx/state_files directory and made it writable by the unprivileged user that runs the NGINX worker processes (as defined by the user directive elsewhere in the configuration). Here we add the state parameter to the keyval_zone directive to create the file blacklist.json for storing the key‑value pairs:

keyval_zone zone=blacklist:64k state=/etc/nginx/state_files/blacklist.json;

Then, we define the key‑value pair with the keyval directive:

keyval $remote_addr $blacklist_status zone=blacklist;
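
Note that the API requests shown below require the NGINX Plus API to be enabled in write mode, as it is in the full configuration later in this post. A minimal location block for that purpose looks like this (in production, consider restricting access with the allow and deny directives):

server {
    listen 80;

    # Enable the NGINX Plus API in read-write mode so that key-value pairs
    # can be created, modified, and deleted over HTTP
    location /api {
        api write=on;
    }
}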

To create a key‑value pair, use an HTTP POST request. For example:

# curl -id '{"10.11.12.13":1}' http://localhost/api/2/http/keyvals/blacklist

To modify the value in a key‑value pair, use an HTTP PATCH request. For example:

# curl -iX PATCH -d '{"10.11.12.13":0}' http://localhost/api/2/http/keyvals/blacklist

To remove a key‑value pair, use an HTTP PATCH request to set the value to null. For example:

# curl -iX PATCH -d '{"10.11.12.13":null}' http://localhost/api/2/http/keyvals/blacklist

For each request, NGINX Plus uses $remote_addr as the key to look up the entry in the key‑value store and assigns the corresponding value to the $blacklist_status variable.
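
The post doesn't show how $blacklist_status is then applied; here is a minimal sketch of one way to use it, assuming blacklisted clients should simply receive a 403 response (the app_backend upstream group name is hypothetical):

server {
    listen 80;

    # Reject requests from client IP addresses whose value in the key-value
    # store is 1; for non-blacklisted clients the variable is empty or 0,
    # so the condition is false
    if ($blacklist_status) {
        return 403;
    }

    location / {
        proxy_pass http://app_backend; # hypothetical upstream group
    }
}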

Split Clients for A/B Testing

The Split Clients module allows you to split incoming traffic between upstream groups based on a request characteristic of your choice. You define the split as the percentage of incoming traffic to forward to the different upstream groups. A common use case is testing the new version of an application by sending a small proportion of traffic to it and the remainder to the current version. In our example, we’re sending 5% of the traffic to the upstream group for the new version, appversion2, and the remainder (95%) to the current version, appversion1.

We’re splitting the traffic based on the client IP address in the request, so we set the split_clients directive’s first parameter to the NGINX variable $remote_addr. The second parameter names the variable, $upstream, which split_clients sets to the name of the selected upstream group.

Here’s the basic configuration:

split_clients $remote_addr $upstream {
    5% appversion2;
    *  appversion1;
}

upstream appversion1 {
   # ...
}

upstream appversion2 {
   # ...
}

server {
    listen 80;
    location / {
        proxy_pass http://$upstream;
    }
}
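
During testing it can be helpful to see which upstream group handled a given request. One option, not part of the original configuration, is to expose the $upstream variable in a response header:

server {
    listen 80;
    location / {
        # Hypothetical debugging aid: report which upstream group
        # split_clients chose, so the split is visible with curl -i
        add_header X-Upstream-Group $upstream always;
        proxy_pass http://$upstream;
    }
}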

Using the Key-Value Store with Split Clients

Prior to NGINX Plus R13, if you wanted to change the percentages for the split, you had to edit the configuration file and reload the configuration. Using the key‑value store, you simply change the percentage value stored in the key‑value pair and the split changes accordingly, without the need for a reload.

Building on the use case in the previous section, let’s say we have decided that we want NGINX Plus to support the following options for how much traffic gets sent to appversion2: 0%, 5%, 10%, 25%, 50%, and 100%. We also want to base the split on the Host header (captured in the NGINX variable $host). The following NGINX Plus configuration implements this functionality.

First we set up the key‑value store:

keyval_zone zone=split:64k state=/etc/nginx/state_files/split.json;
keyval      $host $split_level zone=split;

As mentioned for the initial use case, in an actual deployment it makes sense to base the split on a request characteristic like the client IP address, $remote_addr. In a simple test using a tool like curl, however, all the requests come from a single IP address, so there is no split to observe.

For the test, we instead base the split on a value that is more random: $request_id. To make it easy to transition the configuration from test to production, we define a new variable in the server block, $client_ip, setting it to $request_id for testing and $remote_addr for production. Then we set up the split_clients configuration.

The variable for each split percentage ($split0 for 0%, $split5 for 5%, and so on) is set in a separate split_clients directive:

split_clients $client_ip $split0 {
    *   appversion1;
}
split_clients $client_ip $split5 {
    5%  appversion2;
    *   appversion1;
}
split_clients $client_ip $split10 {
    10% appversion2;
    *   appversion1;
}
split_clients $client_ip $split25 {
    25% appversion2;
    *   appversion1;
}
split_clients $client_ip $split50 {
    50% appversion2;
    *   appversion1;
}
split_clients $client_ip $split100 {
    *   appversion2;
}

Now that we have the key‑value store and split_clients configured, we can set up a map to set the $upstream variable to the upstream group specified in the appropriate split variable:

map $split_level $upstream {
    0        $split0;
    5        $split5;
    10       $split10;
    25       $split25;
    50       $split50;
    100      $split100;
    default  $split0;
}

Finally, we have the rest of the configuration for the upstream groups and the virtual server. Note that we have also configured the API, which is used for the key‑value store, and the status dashboard that is new in NGINX Plus R14:

upstream appversion1 {
    zone appversion1 64k;
    server 192.168.50.100;
    server 192.168.50.101;
}

upstream appversion2 {
    zone appversion2 64k;
    server 192.168.50.102;
    server 192.168.50.103;
}

server {
    listen 80;
    status_zone test;
    #set $client_ip $remote_addr; # Production
    set $client_ip $request_id; # For testing only

    location / {
        proxy_pass http://$upstream;
    }

    location /api {
        api write=on;
    }

    location = /dashboard.html {
        root /usr/share/nginx/html;
    }
}

Using this configuration, we can now control how the traffic is split between the appversion1 and appversion2 upstream groups by sending an API request to NGINX Plus that sets the $split_level value for a host name. For example, the following two requests send 5% of the traffic for www.example.com and 25% of the traffic for www2.example.com to the appversion2 upstream group:

# curl -id '{"www.example.com":5}' http://localhost/api/2/http/keyvals/split
# curl -id '{"www2.example.com":25}' http://localhost/api/2/http/keyvals/split

To change the value for www.example.com to 10:

# curl -iX PATCH -d '{"www.example.com":10}' http://localhost/api/2/http/keyvals/split

To clear a value:

# curl -iX PATCH -d '{"www.example.com":null}' http://localhost/api/2/http/keyvals/split

After each one of these requests, NGINX Plus immediately starts using the new split value.
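
To confirm the values currently stored in the key‑value store, you can read it back with a GET request to the same endpoint, which returns the key‑value pairs as JSON:

# curl -i http://localhost/api/2/http/keyvals/split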

Here is the full configuration file:

# Set up a key‑value store to specify the percentage to send to each upstream group based on the
# Host header.

keyval_zone zone=split:64k state=/etc/nginx/state_files/split.json;
keyval $host $split_level zone=split;

# For a real application you would probably use $remote_addr with split_clients. If testing from
# just one client, $remote_addr is always the same, so use $request_id instead to get some
# randomness. In the server block, $client_ip is set to either $remote_addr or $request_id.

split_clients $client_ip $split0 {
    *   appversion1;
}
split_clients $client_ip $split5 {
    5%  appversion2;
    *   appversion1;
}
split_clients $client_ip $split10 {
    10% appversion2;
    *   appversion1;
}
split_clients $client_ip $split25 {
    25% appversion2;
    *   appversion1;
}
split_clients $client_ip $split50 {
    50% appversion2;
    *   appversion1;
}
split_clients $client_ip $split100 {
    *   appversion2;
}

map $split_level $upstream {
    0        $split0;
    5        $split5;
    10       $split10;
    25       $split25;
    50       $split50;
    100      $split100;
    default  $split0;
}

upstream appversion1 {
    zone appversion1 64k;
    server 192.168.50.100;
    server 192.168.50.101;
}

upstream appversion2 {
    zone appversion2 64k;
    server 192.168.50.102;
    server 192.168.50.103;
}

server {
    listen 80;
    status_zone test;

    #set $client_ip $remote_addr; # Production
    set $client_ip $request_id; # Testing only

    location / {
        proxy_pass http://$upstream;
    }

    # Configure the API and Status Dashboard. For production, access can be restricted with the 
    # allow and deny directives.
    location /api {
        api write=on;
    }

    location = /dashboard.html {
        root /usr/share/nginx/html;
    }
}

Conclusion

This is just one example of what you can do with the key‑value store. You can use a similar approach for request‑rate limiting, bandwidth limiting, or connection limiting. Dynamic IP Blacklisting with NGINX Plus and fail2ban discusses a detailed IP address blacklisting use case, and there are many other possibilities.
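
As a minimal sketch of how the same pattern could extend to request‑rate limiting (not taken from the original post; the zone names, state file path, and rate are illustrative assumptions), you could flag individual client IP addresses in a key‑value store and rate‑limit only the flagged clients:

keyval_zone zone=ratelimit:64k state=/etc/nginx/state_files/ratelimit.json;
keyval      $remote_addr $rate_limited zone=ratelimit;

# An empty key means limit_req does not account the request, so only clients
# flagged with the value 1 in the key-value store are rate-limited
map $rate_limited $limit_key {
    1        $binary_remote_addr;
    default  "";
}

limit_req_zone $limit_key zone=per_client:10m rate=10r/s;

server {
    listen 80;
    limit_req zone=per_client burst=20;

    location / {
        proxy_pass http://appversion1;
    }
}

A PATCH request that sets the value for a client IP address to 1 then starts limiting that client immediately, with no configuration reload.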

If you don’t already have NGINX Plus, start your free 30‑day trial and give it a try.

Retrieved by Nick Shadrin from nginx.com website.