Mongrel error: EMFILE, too many open files

I have a periodically_call_remote call in a partial firing every 0.2
seconds (five times a second).

<%= periodically_call_remote(:update => 'ack_distance_output',
                             :frequency => 0.2,
                             :url => { :action => :ackAdjustDistance }) %>

The action looks like this:

def ackAdjustDistance
  @calMessage = Calibration.getMessage
  if @calMessage.Value == "4" # CALVAR_LINE_EVENT
    render(:inline => %{
      <%= image_tag("stop.png", :class => "bevel",
                    :width => "256", :height => "256") %>
    })
  else
    render(:inline => %{
      <%= image_tag("arrow_down.png", :class => "bevel",
                    :width => "256", :height => "256") %>
    })
  end
end

after a couple minutes I get the following error

Errno::EMFILE (Too many open files -
script/…/config/…/tmp/sessions//ruby_sess.3b99572316c49027):

Once this happens Mongrel can’t serve anything; the only recovery is to
restart the web server.

The periodic render has an image_tag which appears to be opening the
image file, and it’s not being released immediately?

Anything I can do to fix this?

Scott

On Dec 14, 2007 7:16 PM, Scott D. [email protected] wrote:

I have a periodically_call_remote call in a partial firing every 0.2
seconds (five times a second).

[…]

after a couple minutes I get the following error

Errno::EMFILE (Too many open files -
script/…/config/…/tmp/sessions//ruby_sess.3b99572316c49027):

Once this happens Mongrel can’t serve anything; the only recovery is to
restart the web server.

First suggestion: move away from PStore for session storage (use the
ActiveRecord store, SQLSessionStore, or even the memcached one, but not PStore).

The periodic render has an image_tag which appears to be opening the
image file, and it’s not being released immediately?

image_tag shouldn’t be opening any file, just linking to it.

PStore isn’t good when tmp/sessions is full or one Ruby process is
trying to handle a lot of sessions (which will exceed the 1024 open-file limit).


Luis L.
Multimedia systems

A common mistake that people make when trying to design
something completely foolproof is to underestimate
the ingenuity of complete fools.
Douglas Adams

On Dec 14, 2007 5:16 PM, Scott D. [email protected] wrote:

Anything I can do to fix this?

Scott

First, if it’s creating a session for every hit, something is not correct
with the Rails code; investigate this.
If you cannot track it down, try using either cookie or database
sessions.

That said, if you need to have more open files you can adjust your
open file limit.

See the current open-file limit:

$ ulimit -n

Adjust the limit:

$ ulimit -n 2048

This allows up to 2048 open files.
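You can also check the same limit from inside Ruby. Process.getrlimit arrived in Ruby 1.9, so this is for newer Rubies than the 1.8 of that era:

```ruby
# Query this process's file-descriptor limits from Ruby.
# Process.getrlimit is available in Ruby 1.9 and later.
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open-file soft limit: #{soft}, hard limit: #{hard}"
```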

~Wayne

Thanks,

Luis L. wrote:

First suggestion: move away from PStore for session storage (use the
ActiveRecord store, SQLSessionStore, or even the memcached one, but not PStore).

I will try this immediately.

Scott

Wayne E. Seguin wrote:

First, if it’s creating a session for every hit something is not correct
with the Rails code, investigate this.
If you cannot track it down try using either cookie or database
sessions.

If changing the session store method doesn’t fix this I will dig
deeper…

Adjust the file limit:

$ ulimit -n 2048

Tried that, and it just delayed the failure…

thanks, Scott

Well, I changed the session store to memory_store and it still crashes…
I don’t get the EMFILE error; Mongrel just stops responding to any
request.

I would try the database store, but I can’t put the session store in my
default database for concurrency reasons. I tried to find instructions
on how to set it up to use a separate database file but haven’t found
that yet…

I also wonder why Mongrel was still using pstore after I upgraded
to Rails 2.0.1. I thought the default for 2.0 was to store the session in
a cookie on the client? Mine was still creating pstore files in tmp.
Shouldn’t it use the cookie store by default?
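For reference, the 2.0 cookie store does need a session key and secret configured before it works. A sketch, with a made-up app name:

```ruby
# config/environment.rb (Rails 2.0.x) -- sketch with a hypothetical app
# name; the cookie store is the 2.0 default but needs these set:
config.action_controller.session = {
  :session_key => '_myapp_session',
  :secret      => 'a-long-random-string-of-at-least-30-characters'
}
```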

Really a drag having my web server crash just because somebody sits on a
page for more than 10 minutes.

Scott

On Dec 16, 2007 12:24 PM, Scott D. [email protected] wrote:

to Rails 2.0.1. I thought the default for 2.0 was to store the session in
a cookie on the client? Mine was still creating pstore files in tmp.
Shouldn’t it use the cookie store by default?

Nobody else has said it, so I will.

This is not a Mongrel problem.

It’s a problem related to the web framework that you are using, and/or
how you are using that framework. Mongrel doesn’t have anything
whatsoever to do with your application session management details.

I imagine that you will find better, more specific advice by asking
your question in a Rails specific forum.

Kirk H.

Kirk H. wrote:

Nobody else has said it, so I will.
This is not a Mongrel problem.

Kirk,

That’s certainly possible, that it’s not a problem specific to Mongrel. I
think it’s more likely related to my upgrade to Ruby 2.0.1. It’s also
possible it’s a CGI problem, as CGI is used for session store to memory or
pstore.

It is odd that, whatever the problem, Mongrel crashes… It won’t
respond to any HTTP request from any client. It’s dead in the water.

That makes it a Mongrel problem in my eyes, at least to anybody who
cares whether Mongrel is reliable or not.

Scott

On Dec 17, 2007 8:16 AM, Scott D. [email protected] wrote:

That’s certainly possible, that it’s not a problem specific to Mongrel. I
think it’s more likely related to my upgrade to Ruby 2.0.1. It’s also
possible it’s a CGI problem, as CGI is used for session store to memory or
pstore.

Not Ruby 2.0.1. It doesn’t exist.

Rails 2.0.1?

Ruby != Rails.

http://rubyisnotrails.com/

It is odd that, whatever the problem, Mongrel crashes… It won’t
respond to any HTTP request from any client. It’s dead in the water.

That makes it a Mongrel problem in my eyes, at least to anybody who
cares whether Mongrel is reliable or not.

No, it doesn’t, because it doesn’t have anything to do with mongrel
reliability.

The problem is related to Rails deadlocking. When that happens,
control never returns to Mongrel; it is out of Mongrel’s control.
It’s something happening inside of either the Rails code or your
application’s code. It is not happening in Mongrel’s code. Mongrel
is a software component that is responsible for a narrow range of
tasks.

  1. It receives HTTP requests, and parses them.

  2. It determines, based on the request URI, which of its registered
    handlers should handle the request.

  3. It calls the handler, handing it request and response objects.

  4. There are some differences in how the handlers for various
    frameworks actually work, but in the end they all do one thing that is
    the same – they pass control from the handler into the framework’s
    code.

  5. Session handling occurs here, inside of the framework’s code, which
    includes the code for your application. As does the rest of whatever
    your application does to construct its response.

  6. When the framework/app code finishes constructing the response,
    control flows out of the handler and back to Mongrel.

  7. Mongrel sends the response.

That’s the basic cycle of operation. Your problem is almost certainly
inside #5.
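That cycle can be sketched as a toy dispatcher. The class and method names below are illustrative, not Mongrel’s actual API:

```ruby
# A toy model of the request cycle above. Names are illustrative,
# not Mongrel's actual API.
class ToyServer
  def initialize
    @handlers = {}   # URI prefix => handler (step 2's registry)
  end

  def register(prefix, &handler)
    @handlers[prefix] = handler
  end

  def process(request)
    # Step 2: pick a registered handler by URI prefix.
    prefix, handler = @handlers.find { |p, _| request[:uri].start_with?(p) }
    return { :status => 404, :body => 'Not Found' } unless handler

    # Steps 3-6: control passes into the framework/app code and does
    # not come back until it finishes building the response.
    handler.call(request)
  end
end

server = ToyServer.new
server.register('/app') { |req| { :status => 200, :body => "ok for #{req[:uri]}" } }

puts server.process(:uri => '/app/ackAdjustDistance')[:status]   # => 200
puts server.process(:uri => '/nowhere')[:status]                 # => 404
```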

Kirk H.

On Dec 17, 2007 10:05 AM, Scott D. [email protected] wrote:

cause the server to refuse any new connections? It’s not like it’s really
busy with a couple hundred requests a second; it’s 5 requests a second,
on a fast server?

Yeah. The standard Mongrel is multithreaded using Ruby green threads.
When a request comes in, a new Ruby thread is created to handle that
request. This isn’t true concurrency, because all of the green threads
are handled inside the single process, but it lets Mongrel have
multiple things in-process at the same time, which is occasionally
useful (though not as big a deal as a lot of people seem to think).

Rails, however, is unsafe when there is more than one request
in-process at the same time. So the handler for Rails has a mutex in
it that locks the Rails processing to a single request at a time.
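That mutex can be sketched in a few lines: any number of threads may arrive, but only one at a time gets inside the lock, so if the one inside blocks, everything waits.

```ruby
# A sketch of the Rails handler's mutex. Returns the most "requests"
# ever observed inside the lock at once.
def max_concurrency(n_requests)
  guard     = Mutex.new   # the handler's single mutex
  in_flight = 0
  max_seen  = 0

  threads = n_requests.times.map do
    Thread.new do
      guard.synchronize do
        in_flight += 1
        max_seen = [max_seen, in_flight].max
        sleep 0.001       # pretend to do Rails work
        in_flight -= 1
      end
    end
  end
  threads.each(&:join)
  max_seen
end

puts max_concurrency(10)   # => 1: requests are fully serialized
```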

In addition, Mongrel is subject to the same limits on file descriptors
and open files that your app as a whole is.

My suspicion is that your Rails code is eventually deadlocking on an
exhausted open files limit, and when it does that, everything else is
blocked up and nothing moves anymore.

I don’t do Rails, so I’m not going to be of much help in suggesting
solutions, but it sounds like something about how sessions are being
handled is leaving open filehandles lying around. That’s why I
suggest taking the question to a Rails specific forum, where you can
probably find someone who knows the details about the session
management option that you are using, and can pinpoint the problem for
you.

Kirk H.

Kirk H. wrote:

It is odd that, whatever the problem, Mongrel crashes… It won’t
respond to any HTTP request from any client. It’s dead in the water.
That makes it a Mongrel problem in my eyes, at least to anybody who
cares whether Mongrel is reliable or not.

No, it doesn’t, because it doesn’t have anything to do with mongrel
reliability.

I appreciate your responses and am looking at everything to find the
problem. Believe me, I’m not looking for a scapegoat; I’m looking for a
solution.

The reason why I think this is a Mongrel reliability issue is that when
my Mongrel server stops responding, it stops responding to everybody!
Any browser from any client machine that tries to access any page on the
website gets “nada”, no response…

I can’t even get a 404 page-not-found error if I feed it a bad address;
the Mongrel server is locked up and not responding to anybody.

Maybe I don’t understand the linkage between my web app and the web
server. I didn’t think my application, by running a periodic update, could
cause the server to refuse any new connections. It’s not like it’s really
busy with a couple hundred requests a second; it’s 5 requests a second,
on a fast server.

Scott


On Dec 17, 2007 11:05 AM, Scott D. [email protected] wrote:

The reason why I think this is a Mongrel reliability issue is that when
my Mongrel server stops responding, it stops responding to everybody!
Any browser from any client machine that tries to access any page on the
website gets “nada”, no response…

I don’t have your original post, so you may have outlined these details,
but:
Are you running more than one Mongrel? What are you using to balance:
Apache’s mod_proxy, a hardware load balancer, or something else (nginx,
etc.)? Last time I checked, Nginx and Apache use round-robin for
rotating through your Mongrel ports. So, if one Mongrel process is
taking a while to respond, the balancer can’t tell and could send a new
user to that port, causing a hanging request. There are alternatives
that will allow the hung/slow ports to be skipped after a certain
threshold, but nothing that uses Apache (to my knowledge). Here’s a
recent one for Nginx:
http://brainspl.at/articles/2007/11/09/a-fair-proxy-balancer-for-nginx-and-mongrel
If you’re just using one Mongrel, then this won’t apply.

What Kirk is explaining is that it’s not mongrel that is hanging on
the request, it’s the (single-threaded) Rails code. He should know
all about that seeing as he wrote Swiftiply (which you might look into
as a solution). http://swiftiply.swiftcore.org/

ed

Kirk H. wrote:

All that makes sense if:

1.) Sessions are the problem. I upped the file handle limit to 4096 and
now I don’t get an EMFILE error from Mongrel; Mongrel just stops
responding to requests…

2.) Mongrel and my Rails app share the same file handle limit? I
thought they were separate Ruby apps? Or do all Ruby apps share the
same environment?

I’ve run Mongrel with the -D flag and there’s nothing in the log file.

Other apps on the system are running fine, so it’s certainly not a
system-wide problem.

I have posted this problem on multiple Ruby and Rails forums, no luck in
an answer so far. I’m going to remove and reinstall ruby, gems and
rails to see if the multiple updates over the last 6 months have left
some cruft that is causing the problem.


On Dec 17, 2007 10:30 AM, Scott D. [email protected] wrote:

same environment?
Mongrel is an app container. Your Rails app runs inside the same
process as Mongrel does. Mongrel wraps around it and provides the
facilities to get the request from the client to your app, and the
response from the app to the client.

When a Mongrel seems to be hung, have you tried running
strace/truss/something similar on it?

If you are on a Linux system, you’d do something like this:

strace -s8192 -v -p PID -o OUTFILE

Let it go for a few seconds, then break with ctrl-c.

In OUTFILE will be a bunch of lines containing the system calls that
were being made inside the process.
Sometimes this can give one a hint regarding where a piece of code is
stuck.

Kirk H.

On Dec 17, 2007, at 11:05 AM, Scott D. wrote:

The reason why I think this is a Mongrel reliability issue is that when
my Mongrel server stops responding, it stops responding to everybody!
Any browser from any client machine that tries to access any page on the
website gets “nada”, no response…

I can’t even get a 404 page-not-found error if I feed it a bad address;
the Mongrel server is locked up and not responding to anybody.

You generally only have one (or a few) Mongrel(s) running for
everybody. That’s Just How It Works. If Rails hangs in it, it’ll hang
for everybody.

Try setting your app up using Webrick or FastCGI – you’ll very
likely see similar behavior there. If Rails hangs, your web server
will appear to hang – as it will be all busy waiting for Rails to
finish its job.

Mongrel generally handles 404 errors even if you have something like
Apache on the front end – because you don’t really have a “page”
for /my_controller/ackAdjustDistance – that URL needs to have Rails
behind it.

Maybe I don’t understand the linkage between my web app and the web
server. I didn’t think my application, by running a periodic update,
could cause the server to refuse any new connections. It’s not like it’s
really busy with a couple hundred requests a second; it’s 5 requests a
second, on a fast server.

Here’s a question – does your app run forever without the periodic
updater? Can you make hundreds / thousands of requests and never see
a problem? Also – watch your ajax requests with Firebug – do they
all succeed until, finally, one hangs?

The solution is almost certainly gonna be in your application code or
in your Rails setup – there’s something in there that’s not
releasing a limited resource. Increasing resource limits, adding
Mongrels, switching app servers… these will all, at best, delay the
onset of symptoms.

Cheers,
-Nate

PS - At the risk of going too far off-topic… Calibration.getMessage()
looks like an interesting line of code. Is there any chance that
function blocks while waiting for something to happen? Or maybe it,
say, reads from a file but doesn’t properly close it? Or has the
possibility of a race condition and Bad Behavior if something updates
the message while getMessage() is running?
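To illustrate that last possibility (purely hypothetical, since getMessage’s source isn’t shown), the classic descriptor leak looks like this; at 5 requests a second it doesn’t take long to hit a 1024 open-file limit:

```ruby
require 'tempfile'

# Hypothetical stand-in for whatever getMessage reads; the leak pattern
# is real: File.open without a block leaves the descriptor open until
# GC happens to finalize it.
tmp  = Tempfile.new('calibration')
path = tmp.path

leaky = File.open(path)   # descriptor stays open after the read...
data  = leaky.read
puts leaky.closed?        # => false: still holding a file descriptor
leaky.close               # ...unless someone remembers to close it

# The block form closes the file automatically, even on exceptions:
data = File.open(path) { |f| f.read }
```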

I replaced Mongrel with WEBrick and the problem went away.

The offending page has been running for 45 minutes now; using Mongrel it
always died between 5 and 10 minutes.

Guess I’ll use lighttpd or something…

Scott

On Dec 17, 2007 12:16 PM, Scott D. [email protected] wrote:

First, sorry if I’m not polite enough, but why did you upgrade something
that wasn’t broken?

Also, why did you upgrade a production site/application without prior
sandbox/staging testing?

It is odd that, whatever the problem, Mongrel crashes… It won’t
respond to any HTTP request from any client. It’s dead in the water.

Mongrel isn’t crashing; the app that runs on top of it isn’t returning
control to it to keep serving requests. Both Mongrel and Rails share
the same process:

Rails is a web framework that needs an HTTP server.
Mongrel is an HTTP server that parses HTTP requests and sends them to
Rails.
Period.

Rails (not Ruby) is unsafe handling more than one request at a time;
that’s why you need a cluster of Mongrels to serve a middle-to-high
traffic site.

Draw a line between them. Mongrel does not crash your application; your
application isn’t returning control to Mongrel to keep serving other
requests.

Also, you should check your log files to see what the latest request
served by your Rails application was; that could help you.

That makes it a Mongrel problem in my eyes, at least to anybody who
cares if Mongrel is reliable or not.

Again, you should adopt a policy of testing everything in development,
then staging, and then production.

Blaming Mongrel because your Rails application hangs isn’t something
that will help you get the right answer to fix your problem.

To call it a Mongrel problem, you would also need to test other
scenarios, which you didn’t.

Does a plain script/server webrick -e production work for you? I know
that will not be performant, but at least you can see what
isn’t working.

Also, you can test the FastCGI scenario and see if that works.

The EMFILE limit is hit because something on the Rails side is exhausting
file handles:
too many render :partial calls
too many orphaned (not garbage collected) Tempfiles

PStore is a session storage option of Rails, not Mongrel. PStore
uses one temp file per session, a session being each individual user
requesting a page.
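To see why that adds up, here is a small standalone demonstration of the one-file-per-session pattern (PStore is in Ruby’s stdlib; the session IDs below are invented for the demo):

```ruby
require 'pstore'
require 'tmpdir'

# Demonstrates PStore's one-file-per-session layout: every distinct
# session ID gets its own file on disk, so a polling page that keeps
# creating sessions steadily grows tmp/sessions/ toward EMFILE.
Dir.mktmpdir do |sessions_dir|
  3.times do |i|
    # Invented session IDs, standing in for ruby_sess.<hash> files.
    store = PStore.new(File.join(sessions_dir, "ruby_sess.#{i}"))
    store.transaction { |s| s[:user] = "visitor_#{i}" }
  end
  puts Dir.glob(File.join(sessions_dir, 'ruby_sess.*')).size   # => 3
end
```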

You need to look at the Rails documentation to properly upgrade your Rails
1.x application to 2.x, and do it in testing before doing it in
production!


Luis L.
Multimedia systems


On Dec 21, 2007 12:37 PM, Scott D. [email protected] wrote:

Well, at 1 hour 15 minutes WEBrick froze. I had to kill it, and it
output two “deadlock” errors.

I guess Rails just can’t handle running periodically_call_remote with an
interval shorter than 1 second. Eventually two calls sync up and
it deadlocks. The faster I run it, the sooner the deadlock.

You’re trying to handle too many things in less than one second, and
a single instance of most Ruby web frameworks will not handle it
properly.

I don’t know if periodically_call_remote opens a new connection to the
server, but that could mean it’s exhausting available file handles
and thus ending in the problem you’re getting.

Maybe you need a different approach to this problem: why not a push
server instead of a pull one? Take a look at Comet-like
functionality from the JavaScript side to the server side.

If I run lighttpd as a load balancer with a cluster, would that alleviate
the problem? If that is the problem?

Alleviate: yes; solve: no.

As I said, you need a different strategy for this problem, not
“throw more hardware at the problem” – that will not solve it, just
delay the inevitable end: server crash.


Luis L.
Multimedia systems

