High I/O usage

Hi guys,

I am using CentOS 5 with nginx/0.8.16 and PHP-FPM, and I have switched
to nginx on one of my very busy servers.

However, I see very high I/O usage, and the site works pretty sluggishly.

Below is my iostat and df output. As you can see, around 25 MB/s is
being written at any given time.

MySQL is on a second drive, and there is pretty much nothing else on
this one, so I cannot understand why the I/O is so high. I suspected
log writing, but I checked the log folder and that's not it.

I initially set worker_processes to 8, but for some unknown reason
nginx stopped after 1-2 minutes, so I had to increase it to 64, and now
it stays up. That still leaves the I/O problem, which I cannot make
anything of.
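For reference, a sketch of the directives mentioned above. Only the
worker_processes value comes from this post; the events block and the
worker_connections figure are illustrative placeholders, not taken from
the actual config:

```nginx
# Sketch only: worker_processes was first 8, then raised to 64 after
# nginx kept dying. worker_connections below is an illustrative value.
worker_processes  64;

events {
    worker_connections  1024;
}
```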

Device:            tps   MB_read/s   MB_wrtn/s   MB_read   MB_wrtn
sda             123.88        0.00       25.94         0        52
sda1              0.00        0.00        0.00         0         0
sda2              0.00        0.00        0.00         0         0
sda3              1.00        0.00        0.02         0         0
sda4            122.89        0.00       25.92         0        52
sdb              15.42        0.07        0.00         0         0
sdb1             15.42        0.07        0.00         0         0
sdc              69.15        0.58        1.27         1         2
sdc1             69.15        0.58        1.27         1         2

root@rum [/usr/local/nginx/logs]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda4             331G   50G  265G  16% /
/dev/sda3              15G  1.2G   13G   9% /tmp
/dev/sda1              99M   18M   77M  19% /boot
tmpfs                 7.9G     0  7.9G   0% /dev/shm
/dev/sdb1             367G   60G  289G  18% /home2
/dev/sdc1              58G   13G   43G  23% /home3

On Mon, Oct 5, 2009 at 12:02 AM, Stefanita Rares Dumitrescu
[email protected] wrote:

> mysql is on a second drive, and there is pretty much nothing else.
> [iostat and df output snipped; quoted in full above]

Hi,

This is a known issue with CentOS 5; it's the default HD read/write
method (and I believe that version has major issues in general). You
will experience the same issue with any utility or tool that writes to
disk. Try using wget to download a file to disk, then use wget to write
to /dev/null ('wget -S http://site.com/file -O /dev/null') and you will
see the difference. I would also recommend using dstat to diagnose the
problem, as it provides live data, similar to what you posted above,
but lets you monitor I/O, network, and CPU utilization for the entire
system at once.
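A local stand-in for the wget comparison above, for anyone without a
convenient test URL (this uses dd instead of wget, which is my
substitution, not from the original suggestion). Watch `dstat` or
`iostat -m 1` in another terminal while these run; only the first
command should show disk writes:

```shell
# Write 32 MB to disk, then do the same amount of work with the output
# discarded. The gap between the two is what shows up as disk I/O.
dd if=/dev/zero of=/tmp/io_test.bin bs=1M count=32 2>/dev/null  # real disk writes
dd if=/dev/zero of=/dev/null        bs=1M count=32 2>/dev/null  # no disk writes
wc -c /tmp/io_test.bin   # confirm 33554432 bytes were written
rm -f /tmp/io_test.bin
```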

I'd recommend moving off CentOS (Debian, Gentoo, or a better OS such as
Ubuntu), or perhaps downgrading to 4.*.

Good luck,

I don't know what to say about this; I have been using CentOS with
nginx on some other servers as well, and I have not encountered this
issue there. Is there no other solution for this?

My dstat output is:

root@rum [/usr/local/nginx/logs]# dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
13 8 69 8 0 2|3492k 4059k| 0 0 | 430B 767B| 410 7846
6 0 65 26 0 1| 0 22M| 506k 16M| 0 0 |7023 1082
12 3 70 15 0 1| 320k 33M| 244k 7000k| 0 0 |5372 1679
19 2 58 19 0 1| 192k 36M| 221k 7791k| 0 0 |5007 2357
34 7 33 24 0 2|9872k 32M| 541k 19M| 0 0 |7253 7920
21 5 31 41 0 2|5304k 34M| 650k 24M| 0 0 |7507 4415
29 5 31 32 0 2|1584k 38M| 714k 25M| 0 0 |8027 5247
18 4 40 35 0 2| 856k 37M| 735k 25M| 0 0 |7743 4180
24 3 50 20 0 2|3720k 34M| 791k 28M| 0 0 |7924 4753
19 4 52 23 0 2| 424k 36M| 841k 29M| 0 0 |7770 4222
23 4 53 16 0 2| 520k 34M| 781k 31M| 0 0 |7393 4735

Hello!

On Mon, Oct 05, 2009 at 09:02:48AM +0200, Stefanita Rares Dumitrescu
wrote:

> mysql is on a second drive, and there is pretty much nothing else.
>
> Device:    tps   MB_read/s  MB_wrtn/s  MB_read  MB_wrtn
> sda     123.88        0.00      25.94        0       52
> sda1      0.00        0.00       0.00        0        0
> sda2      0.00        0.00       0.00        0        0
> sda3      1.00        0.00       0.02        0        0
> sda4    122.89        0.00      25.92        0       52

Doesn't fastcgi_temp_path point to this drive? If so, this I/O is
probably caused by fastcgi buffering application responses to disk.
The solution is to tune fastcgi_buffers.

You may also consider returning big responses directly from nginx
(e.g. via X-Accel-Redirect).
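A minimal sketch of both suggestions. The directives are real nginx
directives, but every size, address, path, and location name below is
an illustrative placeholder, not a value from this thread:

```nginx
# Keep more of the FastCGI response in memory so nginx does not spill
# it to fastcgi_temp_path on disk. Buffer sizes here are examples only.
location ~ \.php$ {
    fastcgi_pass         127.0.0.1:9000;   # placeholder backend address
    fastcgi_buffer_size  16k;              # first chunk (response headers)
    fastcgi_buffers      32 16k;           # up to 512k of body in memory
}

# X-Accel-Redirect: the PHP backend emits a header such as
# "X-Accel-Redirect: /protected/big.file" and nginx serves the file
# itself instead of buffering it through FastCGI.
location /protected/ {
    internal;                 # reachable only via internal redirect
    root /home2/files;        # hypothetical path
}
```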

Maxim D.

On Tue, Oct 06, 2009 at 09:22:16PM +1100, Dave C. wrote:

> That buffering will be logged in the error log (at info level I think).

At warn level.