The elephant in the room is that programmers can’t put the importance
of execution speed into perspective.
There isn’t a lot of discussion of it on the Ruby list because
execution speed isn’t the biggest issue for most people.
However, when execution speed is an issue, it can be a bad, bad thing.
Anyhow, here are a few random thoughts (with some real numbers)
around the issue.
First off: premature optimisation. We’ve all heard of it, even heard
that it’s bad. Okay, quick quiz: which loop is faster?
#1

    for i in 0...a.length do
      ... do something to a[i]
    end

or

#2

    a.each { |thing| ... do something to thing }

or

#3

    for thing in a do
      ... do something to thing
    end

??
In Ruby 1.6.x, #1 was a bit faster than #3, which was a lot faster
than #2.
(When I replaced the ‘each’ loop form in xampl-pp – an XML pull
parser that I wrote – with the ‘for ... in’ loop form, the parser’s
throughput increased by 2-4 times.)
In Ruby 1.8.4, #3 is about 15% faster than #2, which is quite a lot
faster than #1.
(So I know how to speed up my parser, but I have not done it yet...
it is working well enough, and the real speed-up would be to go to
libxml or something.)
So, what happened? What we consider to be idiomatic Ruby is now only
a little slower than non-idiomatic Ruby. What would have happened if
the community had really regarded execution speed as a primary
criterion? We’d have a different idea of ‘idiomatic’, is what.
Ruby’s focus is on something other than performance.
So what happens when speed is important and you’ve already optimised
your algorithms? These days you move out of Ruby. That’s your only
option. I thank Zed for doing that with Mongrel – it makes a big
difference.
Would a VM have helped? Sure. Sufficiently? I doubt it. Java is,
what, ten times faster than Ruby? Let’s say. Is that sufficient to
actually solve the problem?
Not for a lot of the examples that come up.
Definitely not for me.
Here’s an example. I’ve got a system written in Rails that is a kind
of multi-user authoring tool that runs as a webapp. My customers
build multiple inter-dependent websites using it. Publishing takes
anywhere from 7 to 45 seconds. This is Rails we are talking about
here… nobody gets to do anything while that publishing is going on.
There are a number of ways to alleviate the problem (in particular
async publishing using DRb or something like that).
Thing is, I got bored last weekend and decided to re-write the main
culprit in Common Lisp. I’m using the liquid templating system
because it is simple, my customers like it, and you can’t change any
data when using it. But it is slow (with respect to my application’s
requirements) – I still recommend it. Common Lisp is roughly the
speed of C++, maybe a tad faster these days (actually, on OS X with
that obnoxious gcc 4.x error that prevents compiling at higher
optimisation levels than O1, CL is quite a bit faster than C even –
but let’s not go there). The CL version of liquid was more than 120
times faster than Ruby – and I did nothing to optimise the code.
That’s publishing in less than 0.4 seconds in CL, 45 seconds in Ruby,
and, let’s be generous, 4.5 seconds in Java. Java doesn’t solve my
problem. I doubt a VM for Ruby would either.
What am I going to do? Well, first off, async publishing using DRb.
If I still have a problem I’ll rig up something with CL. No, I’m not
going to jump straight to CL (and just to be crystal clear here, I
like CL just fine, and I like Ruby just fine – I’ve just got other,
higher-priority things than raw execution speed to consider).
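For the curious, here’s roughly what that async-publishing hand-off
over DRb might look like. The Publisher class, its publish method,
and the site id are made up for illustration – the real service would
render the templates and write out the sites:

```ruby
require 'drb/drb'

# Hypothetical publishing service -- a stand-in for the real code
# that renders the templates and writes out the websites.
class Publisher
  def publish(site_id)
    # ... the slow, long-running work happens here, outside the
    # Rails process, so web requests are never blocked by it
    "published site #{site_id}"
  end
end

# Server side: in practice this runs as a separate daemon process.
# A nil URI binds to an ephemeral port; DRb.uri reports the address.
DRb.start_service(nil, Publisher.new)

# Client side (e.g. inside a Rails action): hand the job off to the
# publishing process over DRb.
publisher = DRbObject.new_with_uri(DRb.uri)
result = publisher.publish(42)
```

A plain DRb call like this still blocks the caller; for true async
the webapp would fire it from a background thread, or the server
would queue the job and return immediately.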
Let’s talk a little more about webservers. I’ve written a lot of
Java-based web applications over the years. I had a new webapp to
write, and I was more than a little concerned with performance in
Java. I had prototyped something, but it was horribly complex because
of all the work to minimise page generation (it was a completely
dynamic site, also involving the construction of multi-page
websites). So in the end I wrote the thing in Common Lisp. Two
things: first, it generated the pages so quickly that I didn’t have
to optimise anything – it was quicker to just re-generate; second, I
used TBNL (a slightly modified Lisp framework) to serve the web
pages – it was able to sustain about 70M/s in CL vs about 15k/s in
Java (I thought I was going to have a heart attack; that’s more than
4500 times the throughput).
What’s the point? That how you do things can be pretty important. RoR
gives roughly the same throughput as my Java applications have – even
using WEBrick, and Mongrel is faster still – assuming no long-running
operations. Also, I still don’t do everything in CL.
The trouble with Ruby is that there are situations where some low-
level operation is executed so often that it becomes the problem. For
example, in my XML parser, string manipulation is the main consumer
of time – lots and lots of calls to pretty quick methods. (Sam R.’s
example of string compares in this thread shows the problem: string
comparison can be faster than a method call, yet still be responsible
for most of your execution time.) I don’t have a solution for this.
Maybe a VM will help, but I’m not really sure of that.
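You can see that effect with a trivial micro-benchmark: each
individual compare or call is far under a microsecond, but a parser
does millions of them per document, and that’s where the time goes.
The counts and strings here are arbitrary:

```ruby
require 'benchmark'

N = 2_000_000
s1 = 'element-name'
s2 = 'element-name'

# An empty method, to compare a bare method-call against a compare.
def noop; end

cmp  = Benchmark.measure { N.times { s1 == s2 } }.real
call = Benchmark.measure { N.times { noop } }.real

puts format('%d string compares: %.3fs', N, cmp)
puts format('%d method calls:    %.3fs', N, call)
```

Either way, two million sub-microsecond operations add up to real
wall-clock time – fast primitives don’t save you from an inner loop.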
I also have my suspicions about garbage collection in Ruby. I may be
wrong to suspect GC, but you won’t need a VM to do something to speed
it up.
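One crude way to probe that suspicion without any VM work: time a
workload normally, then again with the collector switched off. This
is only a rough probe (disabling GC trades memory for time, and the
workload here is an arbitrary allocation-heavy loop):

```ruby
require 'benchmark'

# An allocation-heavy workload: lots of short-lived strings, which
# is roughly what an XML parser churns through.
work = lambda do
  200_000.times { 'x' * 50 }
end

work.call                                  # warm up
with_gc    = Benchmark.measure { work.call }.real
GC.disable
without_gc = Benchmark.measure { work.call }.real
GC.enable
GC.start                                   # collect the backlog

puts format('with GC: %.3fs  without GC: %.3fs', with_gc, without_gc)
```

If the gap between the two numbers is large, GC really is a big part
of your runtime for that workload.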
There are trade-offs all over the place here. And aside from using
Mongrel, I don’t think I’ve ever chosen a route (in Ruby) where
native code was involved. And that includes databases – I use a pure
Ruby thing, xampl (see link below), and don’t regret it.
Cheers,
Bob
Bob H. – blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. – http://www.recursive.ca/
Raconteur – http://www.raconteur.info/
xampl for Ruby – http://rubyforge.org/projects/xampl/