NGINX using 100% of the server CPU when testing with Blitz.io

I’m facing an issue and, after six days of looking around, I decided to
ask for some help here; after all, a million heads can think a lot better
than one.

I have an Ubuntu 14.04 server set up with NGINX, HHVM, PHP5-FPM (as a
backup), Percona MySQL and Memcached (which will be replaced by Redis).
I have fastcgi_cache set up for WordPress and object caching done over
Memcached. All cool and dandy in theory, but not in practice.

This is a RamNode OpenVZ SSD VPS with 2GB of RAM and two cores of an
Intel Xeon E5. I also tested on a KVM SSD VPS with 2GB of RAM and four
cores of an Intel Xeon E3.

When I run Blitz.io against it, the server gets absolutely murdered by
the NGINX worker processes, each of which uses 100% CPU according to top
and htop. I usually run with the following pattern:

--pattern 999-1000:60 https://www.geeksune.com/blog/hello-world/

That makes CPU usage go through the roof, and according to Blitz.io this
is the result:

135 HITS WITH 57,734 ERRORS & 234 TIMEOUTS

Obviously that isn’t good. RAM usage stays under 250MB the whole time,
and it seems that all those requests from Blitz.io are hitting the cache,
as seen here:

54.232.204.19 - HIT [23/Nov/2014:19:06:32 -0200] "GET / HTTP/1.1" 200 7632 "-" "blitz.io; [email protected]

Notice the HIT at the start. I set a new log format and added
$upstream_cache_status to it.
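
To check how many requests really hit the cache during a test run, the
cache status field can be tallied from the access log. The following is a
minimal Python sketch, assuming the field order of the sample line above
(remote address, a dash, the cache status, then the bracketed timestamp).
The function name tally_cache_status and the sample lines are made up for
illustration; adjust the pattern if your log_format differs.

```python
import re
from collections import Counter

# Assumed layout: "<remote_addr> - <cache_status> [<time_local>] ..."
# This mirrors the sample log line in this post, not any nginx default.
LINE_RE = re.compile(r'^(\S+) - (\S+) \[')


def tally_cache_status(lines):
    """Count how often each $upstream_cache_status value appears."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Hypothetical sample lines (documentation IP range), for illustration only.
sample = [
    '192.0.2.10 - HIT [23/Nov/2014:19:06:32 -0200] "GET / HTTP/1.1" 200 7632 "-" "example"',
    '192.0.2.10 - MISS [23/Nov/2014:19:06:33 -0200] "GET /blog/ HTTP/1.1" 200 5120 "-" "example"',
    '192.0.2.11 - HIT [23/Nov/2014:19:06:34 -0200] "GET / HTTP/1.1" 200 7632 "-" "example"',
]
counts = tally_cache_status(sample)
print(dict(counts))
```

In practice the lines would come from the access log file itself, for
example by passing an open file object to the function.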

A similar setup on the same machine works just fine with Blitz.io, so
there is definitely something wrong with my NGINX setup, and it seems
related to fastcgi_cache. I get the same results every time, even with
just PHP5-FPM with Zend.

Does anyone have a clue about what is happening? My configuration files
look like this:

Thanks in advance.

:)

Posted at Nginx Forum:

Hi,

Does your error log tell you anything?

mex Wrote:

> Hi,
>
> Does your error log tell you anything?

Nothing at all, I’m afraid. It has an error for OCSP stapling; however,
disabling it doesn’t fix the issue, so I believe that isn’t it.

Hi,

Can you provide the mailing list with the output of ‘nginx -V’ and a
debug log taken while a worker process is pegged at 100%? See
“A debugging log” in the nginx documentation for info on debug logs.

I also wonder why your supplied config has the following:

limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s ;
limit_req_status 444 ;

Are you using this anywhere?

Yichun Z. (agentzh) Wrote:

> 100% CPU usage problems are usually trivial (and also fun) to solve
> with the right tools. For example, the on-CPU flame graph tool like
> [...]


nginx mailing list
[email protected]
nginx Info Page

I will look into those, thank you for the suggestion!

:)

Robert P. Wrote:

> I also wonder why your supplied config has the following:
>
> limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s ;
> limit_req_status 444 ;
>
> Are you using this anywhere?

I was going to use those two lines for wp-login.php protection on
WordPress later on, but I will probably remove them and set something up
with fail2ban instead. I did remove those lines for testing, but it did
not solve the issue.

Here is the output of nginx -V:

And this is the error.log with debug:

That is a gigantic file, but vim could handle it just fine. SSL seems to
be mentioned everywhere in the file, so my next test will be with HTTP
instead of HTTPS. Can you see anything from that?

Hello!

On Wed, Nov 26, 2014 at 11:15 AM, julianfernandes wrote:

> Running Blitz.io on it the server is getting absolutely murdered by the
> NGINX worker processes, which each one using 100% CPU according to top and
> htop.

100% CPU usage problems are usually trivial (and also fun) to solve
with the right tools. For example, the on-CPU flame graph tool like
this:

https://github.com/openresty/nginx-systemtap-toolkit#sample-bt

Linux’s “perf top” can be useful here as well.

Well, just FYI :)

Regards,
-agentzh

OK, I just tested it with HTTP instead of HTTPS, and while it did reach
100% CPU usage from time to time, the average load was 50% and the
Blitz.io test went a lot better:

  • 33,086 HITS WITH 29 ERRORS & 2,344 TIMEOUTS

Why would SSL perform so badly? I generated the key with 4096 instead of
2048, but I don’t believe that would affect CPU usage this much.
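
To put the two Blitz.io runs side by side, the share of requests that
came back as hits can be worked out from the figures reported in this
thread. This is only a rough sketch: it assumes hits, errors and timeouts
are disjoint counts that add up to the total number of attempts, and the
function name success_rate is made up for illustration.

```python
def success_rate(hits, errors, timeouts):
    """Fraction of attempted requests that were counted as hits.

    Assumes the three counts are disjoint and sum to the total attempts.
    """
    total = hits + errors + timeouts
    return hits / total


# Figures reported in this thread for the two test runs.
https_run = success_rate(hits=135, errors=57_734, timeouts=234)
http_run = success_rate(hits=33_086, errors=29, timeouts=2_344)

print(f"HTTPS: {https_run:.2%} of requests were hits")
print(f"HTTP:  {http_run:.2%} of requests were hits")
```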

julianfernandes Wrote:

> Why would SSL perform so badly? I generated the key with 4096 instead
> of 2048, but I don’t believe that would affect CPU usage this much.

http://bench.cr.yp.to/results-encrypt.html

Hello!

On Thu, Nov 27, 2014 at 09:21:49AM -0500, julianfernandes wrote:

> Ok, just tested it with HTTP instead of HTTPS and while it did reach 100% of
> CPU usage from time to time, the average load was 50% and the Blitz.io test
> was a lot better:
>
>   • 33,086 HITS WITH 29 ERRORS & 2,344 TIMEOUTS
>
> Why would SSL perform so badly? I generated the key with 4096 instead of
> 2048, but I don’t believe that would affect CPU usage this much.

The “openssl speed rsa” output should give you an idea how key
sizes can affect CPU usage. E.g., my laptop can do about 500 RSA
signs per second per core with 2048 bit key, but only 60 signs per
second with 4096 key:

$ openssl speed rsa

                  sign    verify    sign/s verify/s
rsa  512 bits 0.000094s 0.000008s  10585.4 119973.8
rsa 1024 bits 0.000262s 0.000020s   3810.5  49395.9
rsa 2048 bits 0.001786s 0.000058s    560.1  17351.8
rsa 4096 bits 0.014317s 0.000256s     69.8   3908.9

Even with 2048 bit keys, SSL needs way more CPU than plain HTTP,
that’s expected.
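
As a rough worked example with the sign/s figures from the table above:
a full TLS handshake with an RSA key costs the server about one
private-key operation, so sign/s approximates a per-core ceiling on new
handshakes per second. The sketch below assumes no session reuse, a new
connection for every request, nothing else competing for the CPU, and two
cores as on the OpenVZ VPS from the first post. The function name
max_handshakes_per_sec is made up for illustration.

```python
# sign/s per core, copied from the "openssl speed rsa" output above.
SIGNS_PER_SEC = {2048: 560.1, 4096: 69.8}


def max_handshakes_per_sec(key_bits, cores):
    """Rough ceiling on full TLS handshakes per second.

    Assumes one RSA private-key operation per handshake and that the
    CPU is doing nothing else.
    """
    return SIGNS_PER_SEC[key_bits] * cores


cores = 2  # assumption: the two-core OpenVZ VPS from the first post
for bits in (2048, 4096):
    ceiling = max_handshakes_per_sec(bits, cores)
    print(f"{bits}-bit key: at most ~{ceiling:.0f} handshakes/s on {cores} cores")

slowdown = SIGNS_PER_SEC[2048] / SIGNS_PER_SEC[4096]
print(f"4096-bit signing is about {slowdown:.1f}x slower than 2048-bit")
```

These are laptop figures, so the absolute numbers on the VPS will differ;
the ratio between the two key sizes is the useful part. Against a test
pattern of roughly 1,000 concurrent users, a ceiling in the low hundreds
of handshakes per second would be consistent with the workers sitting at
100% CPU.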


Maxim D.
http://nginx.org/

Thanks for all the help, guys. I will regenerate this key at 2048 bits
and see how that goes. I still think the CPU usage is too high, but I
will keep trying to optimize my settings.

Have a nice day all =)

Maybe switch to WordPress Super Cache? It handled a Blitz.io 8,000-user
stress test fine with Nginx 1.7.7, PHP-FPM 7.0.0-dev, MariaDB 10.0.x and
CentOS 7.0 on a 2GB DigitalOcean KVM VPS, which works out to 237 million
hits/day:
http://wordpress7.centminmod.com/74/wordpress-super-cache-benchmarks-blitz-io-load-test-237-million-hitsday/
