Using User-Agent & IP address to rate limit

addis_a · July 28, 2014, 5:36pm

Nginx novice here. After spending some time here, reading through other
community forums, and some trial and error, I’m looking for confirmation of
my current Nginx config and/or suggestions for a better one. The end goal is
to use both the IP address and the User-Agent to rate limit requests being
proxied to an external API. Currently the config defines zones with their
respective rate limits, using the IP address as the key, and sets the burst
per location. Inside the main location block the User-Agent is read and,
based on its value, the URI is rewritten to the location with the
appropriate zone.

http {
include mime.types;
default_type application/octet-stream;

limit_req_zone  $binary_remote_addr zone=one:10m   rate=136r/s;
limit_req_zone  $binary_remote_addr zone=two:10m   rate=150r/s;
limit_req_zone  $binary_remote_addr zone=three:10m rate=160r/s;
limit_req_zone  $binary_remote_addr zone=four:10m  rate=30r/m;

sendfile        on;

keepalive_timeout  65;

server {
    listen       443;
    server_name  localhost;

    ssl             on;
    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;

    ssl_session_timeout  5m;
    ssl_protocols  SSLv2 SSLv3 TLSv1;
    ssl_ciphers  HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers   on;

    proxy_ssl_session_reuse off;
    large_client_header_buffers 4 32K;

    location /java {
       limit_req  zone=one   burst=140;

       log_format java '$remote_addr - $remote_user [$time_local]'

    }

    location /python {
       limit_req  zone=two   burst=140;
       #echo "You made it here with: " $request_body "and this: " $args "and this: " $uri "and this: " $1;

       log_format python '$remote_addr - $remote_user [$time_local]'
                         '"$request" | STATUS: $status | BODY BYTES: $body_bytes_sent |'
                         '"$http_referer" "$http_user_agent"| GET PARAMS: $args | REQ BODY: $request_body';
       access_log /var/log/nginx-access.log python;
       proxy_pass https://example.com/;
    }

    location /etc {
       limit_req  zone=four   burst=1;

       log_format etc '$remote_addr - $remote_user [$time_local]'
                      '"$request" | STATUS: $status | BODY BYTES: $body_bytes_sent |'
                      '"$http_referer" "$http_user_agent"| GET PARAMS: $args | REQ BODY: $request_body';
       access_log /var/log/nginx-access.log etc;
       proxy_pass https://example.com/;
    }

    location / {
        root   html;
       index  index.html index.htm;

        if ($http_user_agent = Java/1.6.0_65) {
            rewrite ^(.*)$ /java$uri last;
        }

        if ($http_user_agent = python) {
            rewrite ^(.*)$ /python$uri last;
        }

        if ($http_user_agent = "") {
            rewrite ^(.*)$ /etc$uri last;
        }

    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   html;
    }

}

}

The concern here is whether there is a way to send the request to the
rewritten URI without having to break out and start processing the request
again (the "last" argument). Additionally, is defining zones with the IP
address as the key the proper way to control these different rate limiting
and burst thresholds?

lorenanicole · July 28, 2014, 6:17pm

On Mon, Jul 28, 2014 at 5:35 PM, lorenanicole wrote:

The concern here is if there is a way to redirect the rewritten uri without
having to break out and start processing the request again (argument last)?
Additionally, is the setting of zone’s using the IP address as the key the
proper way to control these different rate limiting and burst thresholds?

I am no expert, but in my own eyes, I would have avoided using:
1°) A series of ‘if’ whose behavior might be clumsy
2°) Redirections which kind of ‘cut the flow’

What I would have done:
1°) Defining all the log_format stuff at the http level: the sooner, the better

2°) Replacing the if series with a map (also in http) like the following:
map $http_user_agent $language {
    ~^Java/ java;
    …
    "" etc;
    default none;
}

3°) Using the output of the first map as the input of two others to define
zone and burst amount, like:
map $language $zone {
    java one;
}

map $language $burst {
    java 140;
}

4°) Processing all the requests in the default location:
location / {
    root html;
    index index.html;

    if ($language != none) {
        limit_req zone=$zone burst=$burst;

        access_log /var/log/nginx/access.log $language;
        proxy_pass https://example.com;
    }

    …
}

All that is not error-proof, coming straight outta my mind with no
testing… but you get the general idea.

B. R.
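
Point 1 above can be illustrated with a minimal sketch. It assumes the "python" format and log path from the original config and nothing else; the empty events block and the bare server block are only there so the fragment parses on its own. log_format is accepted only in the http context, while access_log can refer to a format by name from inside a location.

```nginx
events {}

http {
    # Define the format once, at the http level.
    # Adjacent quoted strings are concatenated into one format string.
    log_format python '$remote_addr - $remote_user [$time_local] '
                      '"$request" | STATUS: $status | BODY BYTES: $body_bytes_sent | '
                      '"$http_referer" "$http_user_agent" | GET PARAMS: $args | REQ BODY: $request_body';

    server {
        location /python {
            # Reference the format by name where the request is handled.
            access_log /var/log/nginx-access.log python;
            proxy_pass https://example.com/;
        }
    }
}
```

The same pattern would apply to the "java" and "etc" formats.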

lorenanicole · July 28, 2014, 6:43pm

Thanks for the prompt feedback! Yes, the continuous if directives put my
teeth on edge as well. I had already tried using map blocks to introduce
variables for $zone and $burst, and got continuous errors. Attempting this
again (per your suggestions), I get this error:

nginx: [emerg] invalid burst rate "burst=$burst" in /usr/local/nginx/conf/nginx.conf:69

With line 69 as follows –

limit_req zone=$zone burst=$burst;

Reading the limit_req documentation (Module ngx_http_limit_req_module),
there is no information specifying that you can use a variable to set these
values. Likewise, I do not believe you can use limit_req inside the context
of an if directive (as you can with proxy_pass, for example; see Module
ngx_http_proxy_module). My attempt to read off the value of $http_user_agent
and set the limit_req inside an if block resulted in many “nginx: [emerg]”
errors.

My second thought now is to use nested locations after doing a rewrite on
the $uri and inside these nested locations set the limit_req, log the info,
and proxy_pass along. This way the request doesn’t have to start over.
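
For comparison, here is a hedged sketch of a rewrite-free alternative that keeps the zone and burst values literal, as limit_req requires. It relies on two documented behaviours of ngx_http_limit_req_module: a request whose key is empty is not accounted by that zone, and several limit_req directives may be given in one location. The variable names ($ua_class, $key_java, and so on), the User-Agent patterns, and the listen port are assumptions made for illustration; the rates and bursts are taken from the config above. It has not been tested against the setup in this thread.

```nginx
events {}

http {
    # Classify the client by User-Agent (patterns are illustrative assumptions).
    map $http_user_agent $ua_class {
        default    other;
        ""         etc;
        ~^Java/    java;
        ~*python   python;
    }

    # One key per zone: the client address when the class matches,
    # an empty string otherwise. An empty key means the zone ignores the request.
    map $ua_class $key_java   { default ""; java   $binary_remote_addr; }
    map $ua_class $key_python { default ""; python $binary_remote_addr; }
    map $ua_class $key_etc    { default ""; etc    $binary_remote_addr; }

    limit_req_zone $key_java   zone=one:10m  rate=136r/s;
    limit_req_zone $key_python zone=two:10m  rate=150r/s;
    limit_req_zone $key_etc    zone=four:10m rate=30r/m;

    server {
        listen 8080;

        location / {
            # All three limits are declared, but only the zone whose key
            # is non-empty for this request actually counts it.
            limit_req zone=one  burst=140;
            limit_req zone=two  burst=140;
            limit_req zone=four burst=1;

            proxy_pass https://example.com/;
        }
    }
}
```

With this layout the request is handled in a single location, so there is no internal redirect, and clients in the "other" class are not rate limited at all.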
