Hi, I’ve started a new project that uses Mongrel. It basically
lets you defer requests to it until other Mongrels in your (Rails)
pool become free.
Rubyforge project page:
http://rubyforge.org/projects/qrp/
gems, tarballs and git repo in case they haven’t hit the mirrors yet:
http://bogomips.org/ruby/
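Once the gem has hit the mirrors, installation should be the usual
(assuming a stock RubyGems setup):

  gem install qrp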
I should also add that nginx 0.6.7 or later is required for the
“backup” feature I mention below in the README:
Queueing Reverse Proxy (qrp)
Ever pick the wrong line at the checkout counters in a crowded store?
This is what happens to HTTP requests when you mix a multi-threaded
Mongrel with Rails, which is single-threaded.
qrp aims to be the simplest (worse-is-better) solution and have the
lowest (adverse) impact on an existing setup.
Background:
An existing Rails site running Mongrels with nginx proxying to them.
Unlike Apache, nginx fully buffers HTTP requests from clients before
passing them off to Mongrels, which allows Mongrels to dedicate more
cycles to running Rails itself.
Problem:
Rails is single-threaded; this is (probably) not easily fixable.
By default, Mongrel will accept and queue requests while Rails is
handling another request.
Some Rails actions take longer than others; sometimes several
seconds are required to respond to an HTTP request.
This problem is exacerbated if the Rails application queries
third-party servers for information.
Any requests queued inside a Mongrel running Rails must wait until
the slow Rails action has finished before they can run.
Even if another Mongrel in the pool becomes free, requests already
queued behind a still-busy Mongrel remain stuck and cannot reach the
free one.
Disabling concurrency on the Rails Mongrel (with “num_processors: 1”
in the config)[1] will cause clients to be rejected outright and users
will see 502 (Bad Gateway) errors.
Returning Bad Gateway errors to clients is bad; a slightly slower
site is still better than a broken one.
The developers also lack the resources to migrate to a thread-safe
platform (such as Merb or Waves) at the moment.
Solution:
Disabling concurrency in the Mongrels running Rails is part of the solution.
Then set up a qrp instance or two as backup members in your nginx
configuration.
Connections will normally go directly from nginx to Rails Mongrels (as
before). However, if all your regular Mongrels are busy, then nginx
will send requests to the backup qrp instance(s).
Once a request gets to qrp, qrp will retry all the members in a
given pool until a connection can be made and a response is returned.
This avoids extra data copies of requests for the common (non-busy)
case, and requires few changes to any existing infrastructure.
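To make the retry behavior concrete, here is a simplified sketch of
the idea in plain Ruby. This is not the actual qrp implementation;
the addresses, rescued errors and back-off interval are made up for
illustration, and real proxying of arbitrary requests is omitted:

  require 'net/http'

  # hypothetical pool; mirrors the upstreams in the qrp config below
  UPSTREAMS = %w(127.0.0.1:3000 127.0.0.1:3001 127.0.0.1:3002 127.0.0.1:3003)

  # Sweep the pool until some Mongrel accepts the connection and
  # returns a response.  A busy, concurrency-disabled Mongrel rejects
  # the connection, which just means "try the next one".
  def fetch(path)
    loop do
      UPSTREAMS.each do |upstream|
        host, port = upstream.split(':')
        begin
          return Net::HTTP.start(host, port.to_i) { |http| http.get(path) }
        rescue Errno::ECONNREFUSED, Errno::ECONNRESET
          next # this Mongrel is busy, move on
        end
      end
      sleep 0.1 # entire pool busy; back off briefly before the next sweep
    end
  end

  response = fetch('/some/slow/action')
  puts response.code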
Setting fail_timeout=0 in the nginx config for every member of the
Rails pool allows nginx to immediately re-add a Rails Mongrel to the
pool once it has finished processing its current request.
— highlights of the nginx config:
  upstream mongrel {
    server 0:3000 fail_timeout=0; # Rails
    server 0:3001 fail_timeout=0; # Rails
    server 0:3002 fail_timeout=0; # Rails
    server 0:3003 fail_timeout=0; # Rails
    server 0:3500 backup; # qrp
    server 0:3501 backup; # qrp
  }
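The upstream block above is the only qrp-specific part; the frontend
half is the usual proxy_pass setup. A minimal sketch, assuming a
typical vhost (this part is not from the qrp README and may differ
in your setup):

  server {
    listen 80;

    location / {
      # nginx buffers the entire client request before it ever
      # connects to an upstream (see Background above)
      proxy_pass http://mongrel;
    }
  }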
— highlights of the qrp config:
  # same Rails upstreams as in the nginx config
  upstreams:
    - 0:3000
    - 0:3001
    - 0:3002
    - 0:3003
    …
— highlight of the mongrel config[1]:
  num_processors: 1
Other existing solutions (and why I chose qrp):
Fair-proxy balancer patch for nginx - This can keep new connections
away from busy Mongrels, but if all (concurrency-disabled) Mongrels in
your pool get busy, then you’ll still get 502 errors.
HAProxy - This will queue requests for you, but only if it makes all
the connections to the backends itself. This means you cannot make
other HTTP connections to the backends without confusing HAProxy,
which (IMHO) defeats the purpose of using HTTP over a custom
protocol.
Swiftiply - Admittedly, I haven’t tried it. It seems to require
changes to our current deployment and monitoring infrastructure.
Additionally, the extra layer between nginx and Mongrel hurts
performance for every request, not just those that get unlucky.
This also seems to take away the flexibility of being able to talk to
any individual Mongrel process using plain HTTP.
Footnotes:
[1] - The current version of mongrel (1.1.3) does not handle the
-n/--num-procs command-line option, and hence the current
mongrel_cluster (1.0.5) is broken with it:
http://mongrel.rubyforge.org/ticket/14
A better solution would be to use mongrel_cow_cluster (also a
development of mine) as it handles the "num_processors:"
directive correctly in the config file and also supports rolling
restarts.
/EOF