I am trying to figure out if there is any way to rate limit all traffic
except Googlebot, msnbot, yandex and baidu bots. Here is what I have
started with:
I can add the Googlebot, msnbot, Yandex and Baidu IP ranges manually to the
whitelist, but that will make the lookup table big. I am not sure whether
this approach will work for high traffic, around 1200 requests/second
distributed across 20 nginx hosts. Any ideas on such a setup would be
really helpful.
Also, can such host lookups be done in real time for every request? I am
guessing that may not be efficient for each request, but I was wondering
if there are any solutions.
It will not work as you expect.
Geo does not support variables in values.
You need something like this:
    geo $whitelist {
        default 0;
        127.0.0.1 1;
        …
    }
    map $whitelist $rate_limit_ip {
        default $binary_remote_addr;
        1 "";
    }
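For context, here is a minimal sketch of how the geo/map pair above could be wired into a rate limit. The zone name per_ip, the 10m zone size, the 10r/s rate, the burst value and the 192.0.2.0/24 range (a documentation-only prefix) are all assumptions for illustration and do not come from the thread. It relies on the documented behaviour that requests with an empty key are not accounted by limit_req_zone.

```nginx
# http context

# 1 = whitelisted client, 0 = everyone else.
geo $whitelist {
    default        0;
    127.0.0.1      1;
    192.0.2.0/24   1;   # placeholder range, replace with real crawler ranges
}

# Whitelisted clients get an empty key, all others are keyed by address.
map $whitelist $rate_limit_ip {
    default  $binary_remote_addr;
    1        "";
}

# Requests with an empty key are not accounted, so they are never limited.
limit_req_zone $rate_limit_ip zone=per_ip:10m rate=10r/s;

server {
    listen 80;

    location / {
        limit_req zone=per_ip burst=20;
    }
}
```

The map is needed because map values may contain variables, while geo values may not.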
I can add the Googlebot, msnbot, Yandex and Baidu IP ranges manually to the
whitelist, but that will make the lookup table big. I am not sure whether
this approach will work for high traffic, around 1200 requests/second
distributed across 20 nginx hosts. Any ideas on such a setup would be
really helpful.
Nginx parses this data and loads it into a radix tree in memory on startup.
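Since the table is only parsed at startup (or on reload), one way to keep a long list manageable is to move the crawler ranges into a separate file and pull it in with the include directive that geo blocks support. This is only a sketch: the file name crawler_ranges.conf and the addresses shown are hypothetical placeholders, not real crawler ranges.

```nginx
# http context
geo $whitelist {
    default    0;
    127.0.0.1  1;

    # Hypothetical file, maintained separately and loaded at startup/reload.
    include    crawler_ranges.conf;
}

# crawler_ranges.conf would contain one "network value;" pair per line,
# for example (documentation-only prefixes used as placeholders):
#
#   192.0.2.0/24     1;
#   198.51.100.0/24  1;
#   203.0.113.0/24   1;
```

After changing the included file, nginx has to be reloaded for the new ranges to take effect.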
Also, can such host lookups be done in real-time for every request? I am
guessing that may not be efficient for each request, but I was wondering if
there are any solutions.
All variables are evaluated when they are used in a request.
I am not sure how, but it’s working only with geo defining IP addresses.
I can see HTTP 503 on client side and also ‘limiting requests, excess:
10.033 by zone’ in error logs. Nginx version: nginx/1.6.0
All variables are evaluated when they are used in a request.
I was wondering if a remote IP’s hostname lookup can be done before
rate-limiting it. For example, I don’t want to block IPs coming from
baidu.com. Can I do such an IP-to-hostname lookup before rate-limiting?
Will it be efficient, or what are the other options?
You defined the key as “$binary_remote_addr” (a literal string, not a
variable), so all clients share one limit.
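To illustrate the point, here is a hypothetical reconstruction of the kind of configuration that leads to this behaviour. The original configuration is not shown in the thread, so the variable and zone names below are assumptions. Because geo does not support variables in values, the default value is stored as plain text rather than being expanded per client.

```nginx
# http context
# Hypothetical example of the problem, not the poster's actual config.
geo $rate_limit_ip {
    # geo treats this value as the literal text "$binary_remote_addr".
    default    $binary_remote_addr;
    127.0.0.1  "";
}

# Every non-whitelisted client ends up with the same key, so they all
# draw from one shared limit, which explains the 503 responses.
limit_req_zone $rate_limit_ip zone=per_ip:10m rate=10r/s;
```

Using geo only for the 0/1 flag and a separate map for the key avoids this.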
I was wondering if a remote IP’s hostname lookup can be done before
rate-limiting it. For example, I don’t want to block IPs coming from
baidu.com. Can I do such an IP-to-hostname lookup before rate-limiting?
Will it be efficient, or what are the other options?
Nginx does not look up remote hostnames at all.
–
WNGS-RIPE