Forgive me for not having read the whole thread. However, there is one
thing that seems really important: ruby hardly ever runs the damned GC.
It certainly doesn’t do full runs nearly often enough (IMO).
Also, implicit OOMEs or GC runs quite often DO NOT affect the
extensions correctly. I don’t know what rmagick is doing under the
hood in this area, but having been generating large portions of
country maps with it (and moving away from it very rapidly), I know
the GC doesn’t do “The Right Thing”.
First port of call is GC_MALLOC_LIMIT and friends. For any small
script that doesn’t breach that value, the GC simply doesn’t run. More
than this, RMagick, in its apparent ‘wisdom’, never frees memory if
the GC never runs. Seriously, check it out. Make a tiny script, and
make a huge image with it. Hell, make 20, get an OOME, and watch for a
run of the GC. The OOME will reach your code before the GC calls on
RMagick to free.
Now, add a call to GC.start, and no OOME. Despite the limitations of
it (ruby performance only IMO), most of the above experience was built
up on windows, and last usage was about 6 months ago, FYI.
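To see the "small scripts never trigger the GC" point for yourself, something like the following rough sketch can help. It assumes a Ruby new enough to have GC.count (not the 1.8 series), and gc_runs_during is a made-up helper name, not part of any library. The numbers will vary from machine to machine.

```ruby
# Sketch only: GC.count is assumed to be available (MRI newer than 1.8).
# gc_runs_during is a hypothetical helper, not a library method.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

# A tiny workload, far below any malloc limit.
small = gc_runs_during { 10.times.map { |i| "tiny #{i}" } }

# An explicit request for a collection.
forced = gc_runs_during { GC.start }

puts "GC runs during small workload:   #{small}"
puts "GC runs after explicit GC.start: #{forced}"
```

On a short script the first number is typically zero, while the explicit call always accounts for at least one run.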
On 24 Mar 2008, at 20:37, Luis L. wrote:
each request. I’m pretty sure this happens in other rails applications
that don’t happen to use ‘RMagick’.
Personally, I’ll simply say call the GC more often. Seriously. I mean
it. It’s not that slow, not at all. In fact, I call GC.start
explicitly inside of my ubygems.rb due to stuff I have observed before:
http://blog.ra66i.org/archives/informatics/2007/10/05/calling-on-the-gc-after-rubygems/
- N.B. This isn’t “FIXED” it’s still a good idea (gem 1.0.1).
Now, by my reckoning (and a few production apps seem to be showing
this empirically (purely empirical, sorry)) we should be calling on the
GC whilst loading up the apps. I mean come on, when is a really serious
number of temporary objects being created? Actually, it’s when
rubygems loads, and that’s the first thing that happens in, hmm,
probably over 90% of ruby processes out there.
sure unrelated to RMagick or garbage collection.
Yes, but even if you “reclaim” the memory with GC, there will be pieces
that will never be GC’ed, since they leaked on the C side, outside
GC control (some of the RMagick and ImageMagick mysteries).
Sure, but leaks are odd things. Some processes that appear to be
leaking are really just fragmenting (allocating more ram due to lack
of ‘usable’ space on ‘the heap’). Call the GC more often, take a 0.01%
performance hit, and monitor. I bet it’ll get better. In fact, you can
drop fragmentation of the first allocated segment significantly just by
calling GC.start after a rubygems load, if you have more than a few
gems.
Can you tell me how you addressed the “schedule” of the garbage
collection execution on your previous scenario? AFAIK most of the
frameworks or servers don’t impose on the user how often GC should
be performed.
In fact there are many rubyists who hate the idea of splatting
GC.start into processes. Given what I’ve seen, I’m willing to reject
that notion completely. Test yourself, YMMV.
FYI, even on windows under the OCI, where performance for the
interpreter sucks, really really hard, I couldn’t reliably measure the
runtime of a call to GC.start after loading rubygems. I don’t know
what kind of ‘performance’ people are after, but I can’t see the point
in not running the GC more often, especially for ‘more common’ daemon
load. Furthermore, hitting the kernel for more allocations more often,
is actually pretty slow too, so this may actually even result in
faster processes under certain conditions.
Running a lib like RMagick, I would say you should be doing this,
straight up, no arguments.
garbage
equivalent of this in Mongrel is the Mongrel Rails dispatcher. Since the
Mongrel Rails dispatcher is distributed as a part of Mongrel, I’d say
this code is owned by Mongrel, which bridges these two worlds when using
mongrel as a webserver.
It doesn’t really matter where you run the GC. It matters that it
runs, how often, and what it’s doing. If you’re actually calling on
the GC and freeing nothing, that’s stupid, but if you’ve run RMagick
up, just call GC.start anyway, and I’m pretty sure it’ll help. There’s
certainly no harm in investigating this, unless you’re doing something
silly with weakrefs.
Then you could provide a different Mongrel Handler that could perform
that, or even a series of GemPlugins that provide a gc:start instead
of the plain ‘start’ command that the mongrel_rails script provides.
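    # In a rails controller (e.g. ApplicationController):
    $occasional_gc_run_counter = 0
    before_filter :occasional_gc_run

    # Force a GC run roughly once every 1,000 requests.
    def occasional_gc_run
      $occasional_gc_run_counter += 1
      if $occasional_gc_run_counter > 1_000
        $occasional_gc_run_counter = 0
        GC.start
      end
    end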
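The same counter idea can be tried outside rails, which makes it easier to poke at in irb. This is only a sketch: OccasionalGC and tick are names I made up for illustration, they are not part of rails or Mongrel, and the interval here is deliberately tiny so the behaviour is visible.

```ruby
# Hypothetical helper, not part of rails or Mongrel.
class OccasionalGC
  attr_reader :calls

  def initialize(interval = 1_000)
    @interval = interval
    @calls = 0
  end

  # Count one unit of work; returns true when this call forced a GC run.
  def tick
    @calls += 1
    return false if @calls < @interval

    @calls = 0
    GC.start
    true
  end
end

gc = OccasionalGC.new(3)            # tiny interval, for demonstration only
results = Array.new(6) { gc.tick }  # true on every third call
p results
```

Calling tick from a filter, a handler, or the end of a job loop all amount to the same thing; what matters is that it gets called regularly.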
Or whatever. It doesn’t really matter that much where you do this, or
when, it just needs to happen every now and then. More importantly,
add a GC.start to the end of environment.rb, and you will have
literally half the number of objects in ObjectSpace.
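If you want to check the object count claim on your own app, a rough sketch like this at the end of environment.rb (or after any heavy batch of requires) will print numbers you can compare. It assumes ObjectSpace.count_objects is available, which is not the case on the 1.8 series, and live_object_slots is a made-up helper name. The actual reduction depends entirely on what was loaded.

```ruby
# Assumes ObjectSpace.count_objects (MRI 1.9 or later); numbers vary per run.
# live_object_slots is a hypothetical helper, not a library method.
def live_object_slots
  counts = ObjectSpace.count_objects
  counts[:TOTAL] - counts[:FREE]
end

before = live_object_slots
GC.start
after = live_object_slots

puts "slots in use before GC.start: #{before}"
puts "slots in use after GC.start:  #{after}"
```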
On a personal note, I believe it is not the responsibility of Mongrel,
as a webserver, to take care of the garbage collection and leakage
issues of the VM on which your application runs. In any case, the GC of
the VM (MRI Ruby) should be enhanced to work better with heavy load and
long running environments.
Right, and it’s not just the interpreter, although indirection around
this stuff can help. (such as compacting).
generally
position is that this just be an option within Mongrel as a web server.
Right, I think this is important too. You’re absolutely right that
there’s no specific place to provide a generic solution. In rails the
answer may be simple, but that’s because rails’ outer architecture is
simplistic. No threads, no out-of-request processing, and so on.
--gc-interval maybe?
Now that you convinced me and proved your point, having the option to
perform it (optionally, not forced) will be something good to have.
Surely you can just:
    require 'thread'
    Thread.new { loop { sleep GC_FORCE_INTERVAL; GC.start } }
In environment.rb in that case.
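Spelled out a little more, that might look like the sketch below. GC_FORCE_INTERVAL is not a real ruby or rails constant, you have to define it yourself, and start_gc_thread is a made-up name. The interval is absurdly short here only so the effect shows up quickly. GC.count is assumed to be available (not on the 1.8 series).

```ruby
require 'thread'

# Hypothetical constant: seconds between forced GC runs.
# Far too small for real use; chosen so the demo finishes quickly.
GC_FORCE_INTERVAL = 0.02

# Hypothetical helper: start a background thread that forces a GC run
# every `interval` seconds until the thread is killed.
def start_gc_thread(interval)
  Thread.new do
    loop do
      sleep interval
      GC.start
    end
  end
end

gc_thread = start_gc_thread(GC_FORCE_INTERVAL)

runs_before = GC.count
sleep 0.5                # let the background thread fire a few times
runs_after = GC.count

gc_thread.kill           # stop the loop so the demo can exit cleanly
gc_thread.join

puts "GC runs while waiting: #{runs_after - runs_before}"
```

Keeping a handle on the thread means you can stop it again, which the one-liner does not allow.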
Of course, this is going to kill performance under evented_mongrel,
thin and so on. I’d stay away from threaded solutions. _why blogged
years ago about the GC, trying to remind people that we actually have
control. I know ruby is supposed to abstract memory problems etc away
from us, and for the most part it does, but hey, no one’s perfect,
right?
http://whytheluckystiff.net/articles/theFullyUpturnedBin.html
Patches are Welcome
Have fun! :o)