I’ve been investigating a fairly infrequent issue where websocket
connections are dropped and the client attempts to downgrade to
long-polling when talking to a CometD server (a proprietary Java
server with Jetty embedded).
To simulate the observed behaviour from our browser clients, I’ve
written my own test script which simply does a
connect/subscribe/disconnect in sequence, one at a time.
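For reference, the message sequence of one such cycle could be
sketched roughly as below. This is only an illustration, not the
actual test script: it assumes the standard Bayeux meta channels used
by CometD, it adds the handshake that Bayeux requires before a
connect, and the function names, client id and channel name are
hypothetical. It only builds the JSON messages and does not open a
websocket itself.

```python
import json
from itertools import count

# Bayeux message ids only need to be unique within a session.
_ids = count(1)


def handshake():
    """Build the /meta/handshake message that precedes the first connect."""
    return {
        "id": str(next(_ids)),
        "channel": "/meta/handshake",
        "version": "1.0",
        "supportedConnectionTypes": ["websocket", "long-polling"],
    }


def connect(client_id, transport="websocket"):
    """Build a /meta/connect message for an already handshaken client."""
    return {
        "id": str(next(_ids)),
        "channel": "/meta/connect",
        "clientId": client_id,
        "connectionType": transport,
    }


def subscribe(client_id, channel):
    """Build a /meta/subscribe message for one channel."""
    return {
        "id": str(next(_ids)),
        "channel": "/meta/subscribe",
        "clientId": client_id,
        "subscription": channel,
    }


def disconnect(client_id):
    """Build the /meta/disconnect message that ends the session."""
    return {
        "id": str(next(_ids)),
        "channel": "/meta/disconnect",
        "clientId": client_id,
    }


def one_cycle(client_id, channel):
    """Messages sent, in order, once the handshake reply supplied client_id."""
    return [
        connect(client_id),
        subscribe(client_id, channel),
        disconnect(client_id),
    ]


if __name__ == "__main__":
    # Hypothetical client id and channel, printed as the JSON arrays
    # that would go over the wire.
    messages = [handshake()] + one_cycle("hypothetical-client-id", "/example/events")
    for msg in messages:
        print(json.dumps([msg]))
```

A real client would send each message over the websocket, wait for
the reply, and only then send the next one, repeating the cycle in a
loop.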
When the script runs through nginx, the event stream stops after a
short while (seconds). The client then tries to reconnect using
long-polling, which also fails, before it finally reconnects
successfully over websockets and responses are fast again for a few
seconds.
When I run the same client without proxying through nginx, connecting
straight to the server, I have no such issues. It never drops any
connections and responses are always fast.
Considering I’m only doing one request at a time, I believe it really
shouldn’t be necessary to tweak the default nginx settings, but I have
done it anyway. Increasing various connection- and timeout-related
limits, both in nginx and in the Linux OS, seems to make the
stoppages less frequent, but they still happen fairly often (at
least compared to outside nginx, where they never happen).
Another weird observation is that if I start another client on the
same machine shortly after the original client stops, the new client
still runs fast for a few seconds before encountering similar issues.
If nginx were out of some resource, I would expect the second client
to stop more or less immediately, but that does not seem to happen.
Despite this, I do believe I am hitting an issue related to nginx,
possibly running out of some resource because nginx behaves
differently from the server it proxies for with regard to freeing
sockets or similar.
I’ve also looked at the number of listening sockets and of sockets
being kept around after close, etc. (the normal “server tuning
stuff”), but I’m not finding any significant numbers.
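One way to take such a socket count repeatedly while the test runs is
sketched below. It is a minimal example, assuming a Linux host where
/proc/net/tcp is readable; the function name count_tcp_states is
hypothetical, and the state codes are the kernel's hexadecimal TCP
state values from that file. It is not the tool used for the numbers
mentioned above.

```python
from collections import Counter

# Hexadecimal TCP state codes as they appear in the "st" column
# of /proc/net/tcp and /proc/net/tcp6.
TCP_STATES = {
    "01": "ESTABLISHED",
    "02": "SYN_SENT",
    "03": "SYN_RECV",
    "04": "FIN_WAIT1",
    "05": "FIN_WAIT2",
    "06": "TIME_WAIT",
    "07": "CLOSE",
    "08": "CLOSE_WAIT",
    "09": "LAST_ACK",
    "0A": "LISTEN",
    "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    # The first line is the column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        state_code = fields[3].upper()
        # Fall back to the raw code if the kernel reports an unknown state.
        counts[TCP_STATES.get(state_code, state_code)] += 1
    return counts


if __name__ == "__main__":
    for path in ("/proc/net/tcp", "/proc/net/tcp6"):
        try:
            with open(path) as f:
                print(path, dict(count_tcp_states(f.read())))
        except OSError:
            print(path, "not readable on this system")
```

Running this every second or so on the nginx host during a test would
show whether states such as TIME_WAIT or CLOSE_WAIT build up just
before a stoppage.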
I am looking for advice and/or pointers on how to avoid this issue
(again, keeping in mind this is testing ONE client doing sequential
requests).
Any help would be appreciated.
Thanks,
Marius K.