So if my test app is CPU bound, I find that one or two cores of my 4 CPUs
are very busy while the other cores are almost idle.

I know accept_mutex can serialize accept() in turn and balance workers via
ngx_accept_disabled, but why does "accept_mutex off" seem to balance the
workers better? When accept_mutex is on, one or two workers accept many
more connections than the others.
With accept mutex enabled, nginx only tries to accept new
connections in one worker process (the one which was first to
become idle). This is expected to cause imbalance in tests with a
small number of connections.
The ngx_accept_disabled variable is mostly unrelated and is only used
when worker_connections are nearly exhausted.
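
If it helps, ngx_accept_disabled is updated on each accept in
ngx_event_accept() (src/event/ngx_event_accept.c); it only becomes
positive once fewer than 1/8 of the worker's connections are still free:

    /* a positive value makes the worker skip that many accept rounds,
       letting less loaded workers pick up new connections instead */
    ngx_accept_disabled = ngx_cycle->connection_n / 8
                          - ngx_cycle->free_connection_n;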
I still can't understand why accept_mutex causes imbalance. In the code
below, multiple workers try to get the mutex, so the question is: why can
one worker always get the mutex? I have tested many times and found that
one worker always accepts many more new connections than the others.
Thanks very much.
    /* from ngx_process_events_and_timers(), src/event/ngx_event.c */

    if (ngx_use_accept_mutex) {
        if (ngx_accept_disabled > 0) {
            /* connections nearly exhausted: skip this accept round */
            ngx_accept_disabled--;

        } else {
            if (ngx_trylock_accept_mutex(cycle) == NGX_ERROR) {
                return;
            }

            if (ngx_accept_mutex_held) {
                /* we hold the mutex: also accept new connections */
                flags |= NGX_POST_EVENTS;

            } else {
                /* another worker holds it: wake up after
                   accept_mutex_delay to try the lock again */
                if (timer == NGX_TIMER_INFINITE
                    || timer > ngx_accept_mutex_delay)
                {
                    timer = ngx_accept_mutex_delay;
                }
            }
        }
    }
On Sun, Aug 03, 2014 at 10:47:26PM -0400, xinghua_hi wrote:
hello,
I still can't understand why accept_mutex causes imbalance. In the code
below, multiple workers try to get the mutex, so the question is: why can
one worker always get the mutex? I have tested many times and found that
one worker always accepts many more new connections than the others.
Thanks very much.
Only the worker which holds the accept mutex will try to accept new
connections. Other workers will only process events they already
have, or try to grab the accept mutex again after a 500ms timeout
(accept_mutex_delay[1]) if there are no other events to handle.
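
Schematically, each event-loop iteration in a worker does something like
this (a minimal sketch, not nginx source; accept_mutex here is a
hypothetical shared flag, and 500 stands in for accept_mutex_delay):

    #include <stdatomic.h>

    static atomic_int accept_mutex;  /* in shared memory in real nginx */

    /* returns the timeout the worker passes to epoll_wait()/kevent() */
    static int
    event_loop_timeout(int *held)
    {
        int expected = 0;

        if (atomic_compare_exchange_strong(&accept_mutex, &expected, 1)) {
            *held = 1;    /* this worker accepts new connections */
            return -1;    /* block indefinitely waiting for events */
        }

        *held = 0;        /* another worker accepts; retry the lock */
        return 500;       /* ...after accept_mutex_delay milliseconds */
    }

The worker that fails the trylock spends the next 500ms ignoring new
connections entirely, which is the key to the scenario below.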
Consider a short test on an otherwise idle server like the one you are
doing, with many connections established during a small period of
time. Assume there are 2 workers:

- worker A holds the accept mutex; worker B waits for the 500ms
  timeout, doing nothing;
- in a short period of time, 1000 connections come in;
- worker A is woken up by the kernel and accepts a connection;
- worker A goes back to the kernel to wait for more data; since
  worker B is in the kernel waiting out its 500ms timeout, the accept
  mutex is again locked by A;
- worker A is woken up again, and the above repeats multiple times.
More or less this continues till worker B wakes up after 500ms and
tries to lock the accept mutex. If it is lucky and this happens
while worker A is busy doing something, it will be able to lock the
accept mutex, and further connections will be accepted by worker B.
If worker B isn't lucky, worker A will keep accepting connections
for longer. For short tests this may mean that all connections are
accepted by a single worker. (And things will be even worse if
multi_accept[2] is used.)
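
If the goal is an even spread in short benchmarks like this, the usual
options are to disable the mutex or to shorten the delay, e.g.:

    events {
        accept_mutex off;          # all workers race on accept()
        # or keep it on but retry sooner:
        # accept_mutex_delay 50ms;
    }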
On a normally loaded server the above situation isn't likely to
happen, as all workers are periodically woken up by the kernel and
will try to lock the accept mutex when going back to the kernel. Thus
connections are distributed among all workers more or less evenly.
In short tests, though, accept_mutex can easily cause imbalance as
described above.
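
[1] http://nginx.org/r/accept_mutex_delay
[2] http://nginx.org/r/multi_accept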