Hi Zed,
Thanks very much for your reply; I appreciate it. More below.
Zed A. Shaw wrote:
Restart (USR2) isn’t that reliable because of
how ruby/rails needs to reload. You should
use a full stop/start cycle instead.
I’ve been doing that. Should have said it that way.
Once you start doing that and you still get no
response then run strace on the mongrel process
to see what it’s doing.
The folks at the hosting service (a2hosting) have done that, I think, and
they tell me it hasn’t given them any clues.
It’s most likely that you have a library that’s messing
things up.
If by ‘library’ you mean gems / plugins, I do have a couple installed:
PDF::Writer and BackgroundRb. But BackgroundRb is stopped and there are no
calls to PDF::Writer in the test scenario I’ve been using. I thought Rails
didn’t load code before it needed it. Anyway, here’s what I’m doing.
1. Start mongrel using
   mongrel_rails start -d -p 3002 -e production < /dev/null >& /dev/null
2. Open a browser and browse to the index action of the Rails app (the index
   action is empty and the view renders a page with a form on it that has a
   group of 3 radio buttons and a submit button).
3. Close the browser.
4. Wait 40 minutes, then do step 2 again.
The result of step 4 is a blank white screen, not an error message. Neither
production.log nor mongrel.log has anything written to it at that point
other than the initial startup messages in mongrel.log and the initial
request from step 2 in production.log. When I stop mongrel, it writes a
message to both logs saying it’s killing a slow worker. That’s the only clue
I get. Steps 1-4 can be repeated ad nauseam.
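In case it helps anyone reproduce this without a browser, the steps above can be sketched as a small Ruby script. This is only an illustration under assumptions, not something that was run on the VPS: the names `probe` and `idle_check`, the URL http://localhost:3002/ and the 30-second timeout are all made up for the example. The point is that a hard timeout separates "mongrel answered with an empty body" from "mongrel never answered", which a blank browser window does not.

```ruby
require "net/http"
require "uri"

# Fetch +url+ once and classify the outcome (hypothetical helper).
# A hung mongrel should show up as :timeout instead of a blank page.
def probe(url, timeout_seconds = 30)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.open_timeout = timeout_seconds
  http.read_timeout = timeout_seconds
  http.max_retries = 0 # do not silently retry a request that timed out
  response = http.request_get(uri.request_uri)
  { status: :responded, code: response.code, bytes: response.body.to_s.bytesize }
rescue Net::OpenTimeout, Net::ReadTimeout
  { status: :timeout, code: nil, bytes: 0 }
rescue SystemCallError, SocketError, EOFError => e
  { status: :error, code: nil, bytes: 0, error: e.class.name }
end

# Steps 2 to 4: request, stay idle, request again (hypothetical helper).
# idle_seconds defaults to the 40 minutes mentioned above.
def idle_check(url, idle_seconds = 40 * 60, timeout_seconds = 30)
  before = probe(url, timeout_seconds)
  sleep idle_seconds
  after = probe(url, timeout_seconds)
  { before: before, after: after }
end
```

Something like `p idle_check("http://localhost:3002/")` would run one cycle. If `after` comes back as `:timeout`, the process accepted the connection but never replied, which would be consistent with the "slow worker" message at shutdown.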
After their initial investigation the sys admins concluded the problem had
to be in my code (surprise, surprise). I had a slightly different version
that was running flawlessly on my shared account with them, so I decided to
‘dig in.’ I copied the entire Rails application directory for both apps to
my PC, then copied the one I was having problems with on the VPS over to the
shared account. It ran without a problem (and, a week later, is still
running without a problem).
Yesterday they set up a cron job outside my space to invoke the index method
every 10 minutes. That’s ‘solved’ the problem. It doesn’t sit well with me
though. My experience leads me to fear that whatever the real cause of the
problem is will eventually manifest itself again. At the worst possible
time.
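The actual cron job set up by the hosting service is not shown here. As a rough sketch of what such a keep-alive could look like, assuming a Ruby script run from cron, the following requests the page once and prints one log line per run. The function name `keepalive_line`, the script and log paths in the crontab comment, and the URL are all hypothetical. Logging the outcome and elapsed time at least leaves a record if the hang comes back despite the keep-alive.

```ruby
require "net/http"
require "uri"

# Hypothetical crontab entry (paths and URL are assumptions):
#   */10 * * * * ruby /path/to/keepalive.rb http://localhost:3002/ >> /path/to/keepalive.log

# Request +url+ once and return a one-line log entry with the outcome
# and the elapsed time, so a slow or hung reply is visible afterwards.
def keepalive_line(url, timeout_seconds = 30, now = Time.now)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.open_timeout = timeout_seconds
  http.read_timeout = timeout_seconds
  http.max_retries = 0 # report a timeout instead of retrying
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  outcome = begin
    "HTTP #{http.request_get(uri.request_uri).code}"
  rescue Net::OpenTimeout, Net::ReadTimeout
    "TIMEOUT"
  rescue SystemCallError, SocketError, EOFError => e
    "ERROR #{e.class}"
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  format("%s %s %s %.2fs", now.strftime("%Y-%m-%d %H:%M:%S"), url, outcome, elapsed)
end

# Only does anything when a URL is passed on the command line.
puts keepalive_line(ARGV[0]) if ARGV[0]
```

A keep-alive like this hides the symptom but does not explain it, so a run of TIMEOUT lines in the log would be the first sign that the underlying problem is still there.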
If you (or anyone else reading this) have any interest in understanding
what’s really going on (since mongrel’s behavior is the only clue from an
end-user perspective), please let me know. I’ll give you any access you
need. If not, I understand completely.
Thanks again for your time.
Best regards,
Bill