Embedding JRuby in a servlet - best practices for high performance

dubstep · September 12, 2011, 2:00pm

Hi guys,

We are embedding JRuby in a servlet (using ScriptingContainer) and
running our own application framework on top of it, so that we can build
a number of applications (mostly “middleware”-oriented, supporting CRUD
operations via simple REST APIs). This works fine, but we are having
performance problems and memory leaks.

For the memory leaks, I think the problem could easily be solved by
calling terminate() on the ScriptingContainer before the request ends,
but the performance problem is harder. We started off by creating a new
JRuby container for each incoming request. This has the obvious
advantage of a “clean” environment every time our Ruby code runs, but
performance is rather bad. So we changed things so that the
ScriptingContainer is instantiated once, in the servlet’s init() method.

This works, and performance is pretty OK. The first request can take
around 14 (!) seconds, but later requests are much faster (normally
between 100 and 200 ms). So why not just go with this approach? Well,
there are two remaining issues:

1. We don’t get a “clean” context this way. For example, we get a
   warning like this on every web hit: “ServletBootstrapper.rb:12
   warning: already initialized constant BUFSIZE”. Is there a way to
   “clean” the context in Ruby (without throwing away the container
   altogether)? Alternatively, if I could “clone” the container before
   each request starts, that could also work.

2. Sometimes we get “bumps”: individual requests that take multiple
   seconds to run. In these cases the Tomcat output indicates that
   certain gems are being reloaded, like this:
   “build/web/WEB-INF/lib/mwGems.jar!/gems/dp-1.0.1/lib/dp.rb:2 warning:
   `binding' should not be aliased”. Note that we currently bundle the
   gems we need in an mwGems.jar file, to keep them all in one place. If
   keeping them unpacked in the file system is much faster, we may very
   well do that instead.
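
To make the first issue concrete: the “already initialized constant”
warning is what Ruby prints when code that assigns a constant is
evaluated a second time in the same runtime. The sketch below is only an
illustration, not our real code. The one-line scripts stand in for
whatever ServletBootstrapper.rb does on line 12 (the value 4096 is made
up), and an anonymous module stands in for the long-lived container. It
shows the warning appearing on re-evaluation and going away when the
assignment is guarded with defined?.

```ruby
require "stringio"

# Evaluate +source+ +times+ times inside one fresh module (a stand-in for a
# runtime that survives across requests) and return everything Ruby wrote
# to $stderr while doing so.
def warnings_from(source, times: 2)
  runtime = Module.new
  captured = StringIO.new
  original_stderr = $stderr
  original_verbose = $VERBOSE
  $stderr = captured
  $VERBOSE = false # default level: warnings on, verbose-only warnings off
  begin
    times.times { runtime.module_eval(source) }
  ensure
    $stderr = original_stderr
    $VERBOSE = original_verbose
  end
  captured.string
end

# Hypothetical stand-ins for the bootstrap script; the value is invented.
unguarded = "BUFSIZE = 4096"
guarded   = "BUFSIZE = 4096 unless defined?(BUFSIZE)"

puts warnings_from(unguarded).include?("already initialized constant")
puts warnings_from(guarded).empty?
```

A guard like this only silences the warning. It does not reset any other
state the script leaves behind, so it is not a full answer to the
“clean context” question.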

I would say problem #1 is the main issue with the current
implementation, but problem #2 is also quite important to fix.
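
For problem #2, one thing that may be worth ruling out (an assumption on
my part, I have not confirmed it is the cause) is that some code path
uses load, or re-evaluates a file, instead of require. Within one
runtime, require executes a given file only once, while load executes it
on every call. A minimal sketch with a throwaway file, where
fake_gem_entry.rb is a made-up stand-in for something like dp.rb:

```ruby
require "tmpdir"

# Write a tiny script that counts how often it is executed, call +loader+
# on it three times, and return the number of executions.
def count_executions(loader)
  $executions = 0
  Dir.mktmpdir do |dir|
    path = File.join(dir, "fake_gem_entry.rb") # hypothetical file name
    File.write(path, "$executions += 1\n")
    3.times { loader.call(path) }
  end
  $executions
end

puts count_executions(->(path) { require path }) # 1: later calls are no-ops
puts count_executions(->(path) { load path })    # 3: executed on every call
```

If the gems are pulled in with require and still show up as reloaded,
that would point at a fresh runtime being created somewhere rather than
at the loading call itself.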

Many thanks in advance for your help.

Best regards,
Per L.