The MongoDB project is adding a native Java extension to support JRuby
from their Ruby driver. You can see the current code here [1].
There’s a “slight” problem though. The native Java code is only about 2x
faster than pure Ruby, which makes it around 5x slower than MRI using the
C extension. To make matters worse, the MRI C extension itself is about
10x slower than the pure C driver and 8x slower than the pure Java
driver. All in all, Ruby performance in general is quite poor.
I’ve forked the project and submitted a few performance patches back
into this JRuby branch, but its performance still sucks. I have an idea
for improving it but before I go and write a bunch of code I’d like to
ask here to make sure it’s a feasible idea.
The current Java code makes sure that it boxes all primitives as JRuby
primitives before doing any real operations on them. For instance, a Map
is allocated as a RubyHash, an integer is allocated as a RubyFixnum and
a string is allocated as a RubyString. JavaEmbedUtils is then used to
call Ruby methods that operate on these boxed objects.
For example:

    RubyString rkey = RubyString.newString(_runtime, name);
    JavaEmbedUtils.invokeMethod(_runtime, current, "[]=",
        new Object[] { (IRubyObject)rkey, o }, Object.class);

I’m wondering if all of this boxing of primitives can be avoided or at
least done lazily. I’ve written a bit of Ruby before that accessed Java
objects like Maps and Lists and I recall that the runtime added a bunch
of syntactic sugar to these classes. So even though Map does not define
the #[]= method, I could use that and JRuby would make sure the right
thing got done. Similarly, I remember from those old experiments that my
Maps and Lists could contain Java primitives (int, Integer, String, etc)
and I didn’t need to do anything special to use them from Ruby. Again,
JRuby did the right thing.
So here is my real question (I love to bury the lede).
Can I modify the driver to just use Maps, Lists and regular Java boxed
primitives and leave it up to the runtime to lazily convert them to Ruby
objects as they are accessed? What are the downsides? Can I count on
this behavior being supported in future versions of JRuby?
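
To make the question concrete, here is a rough sketch of the kind of lazy conversion I have in mind. It is plain Java with no JRuby types in it. LazyConvertingMap and the converter function are hypothetical names of my own, not anything in JRuby or the driver; the converter stands in for whatever would box a Java value as a Ruby object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: not part of JRuby or the MongoDB driver.
// Keeps decoded values as plain Java objects and converts each one the
// first time it is read, caching the result for later reads.
final class LazyConvertingMap<K, R> {
    private final Map<K, Object> raw;                    // plain Java values from the decoder
    private final Map<K, R> converted = new HashMap<>(); // values already converted
    private final Function<Object, R> converter;         // stand-in for Java -> Ruby boxing
    private int conversions = 0;                         // how many conversions actually ran

    LazyConvertingMap(Map<K, Object> raw, Function<Object, R> converter) {
        this.raw = raw;
        this.converter = converter;
    }

    // Convert on first access only; untouched keys are never converted.
    R get(K key) {
        if (converted.containsKey(key)) {
            return converted.get(key);
        }
        if (!raw.containsKey(key)) {
            return null;
        }
        R value = converter.apply(raw.get(key));
        conversions++;
        converted.put(key, value);
        return value;
    }

    int size() {
        return raw.size();
    }

    int conversionCount() {
        return conversions;
    }
}
```

The point is that a document with many fields only pays the conversion cost for the fields the caller actually reads. Whether JRuby's own handling of Java Maps and Lists already behaves this way is exactly what I am asking.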
If that takes care of improving the decoding phase, the encoding phase
is still rather pokey. What tricks can be used to rapidly convert Ruby
objects to Java objects so they can be BSON encoded very fast?
I’ve run the existing code under the VisualVM profiler, but it’s very
hard to pick out the slow spots when there is so much internal JRuby
stuff in use.
Thanks for any suggestions.
cr
[1]
http://github.com/mongodb/mongo-ruby-driver/blob/jruby/ext/java/src/org/jbson/