Speeding up "require 'rubygems'"

Thought I’d post here in case others have any feedback…

“loading rubygems takes quite awhile on jruby”

So gents, what can we do?

Seems that yyparse takes a small chunk of time (benchmarking a file that
just does load ‘rubygems’)

 Interpreted + native  Method
 12.4%   21  +   0   org.jruby.parser.DefaultRubyParser.yyparse

Maybe jruby can cache those outputs somehow, similar to rbx’s .rbc
files?

Among the offenders, of 1.5s total, the following requires took quite
a while (times in seconds):

date
0.187999963760376

yaml
0.421999931335449

rubygems/config_file
0.141000032424927

Not sure if that can be sped up at all. Maybe we could multi-thread some
of the requires so that they can take advantage of dual processors?
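For anyone who wants to reproduce per-require numbers like the ones above, here is a minimal sketch. The helper name time_requires is made up for illustration. A library that is already loaded will report close to zero, so it only makes sense in a fresh process.

```ruby
# Hypothetical helper: time each require individually, in seconds.
def time_requires(libs)
  libs.map do |lib|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    require lib
    [lib, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
  end
end

# The three libraries singled out in this post.
time_requires(%w[date yaml rubygems/config_file]).each do |lib, seconds|
  puts format("%-24s %.3f", lib, seconds)
end
```

Note that nested requires are charged to whichever library pulls them in first, so the order of the list affects the split.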

One option that I’ve considered before is to basically replace gem
“binary scripts” with ones that use gem_prelude by default (and hard-code
the bin/xxx file location to the latest installed version).

That works “fine” for cases where scripts never use more than

    require 'rubygems'  # this would become just prelude
    require 'some file' # uses the lib paths

If scripts were to ever use something like

    gem 'xxx', '>= 3'

however, we would need to load full-blown rubygems (to load all the
specs).

I have something like this going for windoze and it results in a big
speedup, at least for simple scripts.
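A rough sketch of what such a replacement script could look like. Everything here is hypothetical: the helper name and the example install path are made up, and a real version would be generated at gem install time. The stub never touches RubyGems, so it only suits executables that stick to plain require.

```ruby
# Hypothetical helper: builds the source of a lightweight bin stub that
# hard-codes one installed version of a gem instead of asking RubyGems.
def fast_stub_source(gem_dir, bin_name)
  lib_dir  = File.join(gem_dir, "lib")
  bin_file = File.join(gem_dir, "bin", bin_name)
  [
    "#!/usr/bin/env ruby",
    "# Generated stub: no require 'rubygems', paths are hard-coded.",
    "$LOAD_PATH.unshift #{lib_dir.inspect}",
    "load #{bin_file.inspect}",
    ""
  ].join("\n")
end

# Example with a made-up install location.
puts fast_stub_source("/opt/gems/some_gem-1.2.3", "some_gem")
```

The obvious cost is that the stub has to be regenerated whenever a newer version of the gem is installed.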

For more complex gems that need more…I guess the only option would be
to store the gem specs in a database and then use that for the lookups
desired (and refresh it when it gets out of date, which is reasonably
rare so probably not too painful for end users). Then you would only
need to load “full blown” rubygems when you need something like

Gem::Specification

for creating gems or what not.

The real question, though, is whether we are addressing the true root of
the problem: why can MRI do a ‘require “rubygems”’ in 0.1s while jruby
takes 1.5s, with basically all the same requires? If that could be addressed
then it would help load times for other projects that also use jruby
(ex: Redcar takes 10s to start for me in Linux, 3s of it spent in requires).

Anyway, of the options discussed, any feedback?

The first option is easier.
Thanks.
-r

Thoughts below…

On Thu, Feb 25, 2010 at 10:40 AM, Roger P. wrote:

Seems that yyparse takes a small chunk of time (benchmarking a file that
just does load ‘rubygems’)

 Interpreted + native  Method
 12.4%   21  +   0   org.jruby.parser.DefaultRubyParser.yyparse

Maybe jruby can cache those outputs somehow, similar to rbx’s .rbc
files?

I proposed some improvements to DefaultRubyParser that we might be
able to roll into 1.5. Basically, the parser that Jay generates
already has to be post-processed to fix some issues and add some
optimizations. A few additional tweaks could make it faster.

It’s hard to nail down, though…parsing does seem to be pretty fast
for us these days, but it’s a lot slower when the JVM is still cold.
There’s only so much we can do to compensate for the JVM warmup time.

Caching parse results may be worth looking into. We used to be able to
serialize the AST to disk, and removed it because parsing was only a
small fraction of JRuby’s then-slow performance. Nowadays, we may see
some real benefit from persisting parse results to disk.

Among the offenders, of 1.5s total, the following requires took quite
awhile

date
0.187999963760376

date is a big library, but it also runs a bunch of code on startup.
Some of that could possibly be eliminated (and maybe has been in
1.9.2?). I have not explored the possibilities here.

yaml
0.421999931335449

This is somewhat troublesome, since this should be a mostly-Java
library. My current theory is that yaml is not shipping a set of
generated handles (like we do for JRuby core classes) which means
those handles have to be generated, verified, and classloaded separate
from JRuby proper. I’ll look into this today.

rubygems/config_file
0.141000032424927

Not sure if that can be sped up at all. Maybe could multi-thread some
of the requires so that it can leverage dual processors?

Threading requires is an interesting idea. For some of these
libraries, it could actually help a lot, since concurrent requires of
the same lib will block. We’d have to be careful not to concurrently
require anything circular, but hopefully there’s no code like that in
stdlib.
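A minimal sketch of the idea, assuming the listed libraries do not require each other in a cycle. The helper name is hypothetical, and whether this buys anything depends on the runtime: it needs real parallel threads and independent libraries to pay off.

```ruby
# Hypothetical helper: require each library on its own thread, then wait.
# Thread#join re-raises any LoadError from a worker thread.
def require_in_parallel(libs)
  threads = libs.map { |lib| Thread.new { require lib } }
  threads.each(&:join)
  nil
end

require_in_parallel(%w[date optparse])
```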

however, we would need to load full-blown rubygems (to load all the
specs).

I have something like this going for windoze and it results in much
speedup. For simple scripts.

I really just wish we could get some smarts rolled back into RubyGems.
The amount of work it does on every boot is really rather absurd, but
nobody’s bothered to fix it because it’s “fast enough” on MRI.
Unfortunately, it’s not “fast enough” on any of the jitting VMs, so we
suffer tremendously because of cold performance and all that boot-time
work.

As you mentioned offline, faster_require gets much of its improvement
from eliminating all the file statting that’s so slow for MRI on
Windows. Once we deal with these other issues (cold parse performance,
too much ruby code executing on rubygems boot, slow builtin library
loads), stat may be the next “big thing” to fix.

For more complex gems that need more…I guess the only option would be
to store the gem specs in a database and then use that for the lookups
desired (and actualize it when it gets out of date–which is reasonably
rare so probably not too painful for end users). Then you would only
need to load “full blown” rubygems when you need something like

Gem::Specification

for creating gems or what not.

A “real” database that’s updated when installed gems change and
doesn’t require re-calculating all that data on every boot would be
excellent. We’ve toyed with the idea of implementing something atop
HSQLDB or Derby, since they’re pure-Java and have nice compact on-disk
formats, but RubyGems is pretty complicated to hack such a database
into.
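Something far simpler than HSQLDB or Derby shows the shape of the idea: a marshalled index that is only rebuilt when the gems directory changes. This is a sketch, not how RubyGems works. The helper name is hypothetical, and it assumes installs and uninstalls change the mtime of the directory being watched.

```ruby
# Hypothetical on-disk cache for whatever index an expensive gem scan produces.
# The block only runs when gems_dir has changed since the cache was written.
# Keep cache_file outside gems_dir, or writing it would itself bump the mtime.
def cached_gem_index(gems_dir, cache_file)
  mtime = File.mtime(gems_dir)
  stamp = [mtime.to_i, mtime.nsec]
  if File.exist?(cache_file)
    cached = Marshal.load(File.binread(cache_file))
    return cached[:index] if cached[:stamp] == stamp
  end
  index = yield # the slow path: scan and evaluate the specs
  File.binwrite(cache_file, Marshal.dump({ stamp: stamp, index: index }))
  index
end
```

The caller passes the slow scan as the block, so the first boot after an install pays the full cost and later boots only read one file.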

The real question is though…are we addressing the true root of the
problem…why can MRI do a ‘require “rubygems”’ in 0.1s and jruby 1.5,
including basically all the same requires? If that could be addressed
then it would help load times for other projects that also use jruby
(ex: Redcar takes 10s to start for me in Linux–3s spent in requires).

I posted a blog entry about “startup time tips” here:

The #1 thing you can do to improve startup is to make sure you’re
running the client VM. After that, there’s various other tricks (and
lots of areas I think we can improve in JRuby proper).

The bottom line, sadly, is that the JVM is not a fast VM during the
first 5 seconds of its run. Even Java code can be an order of
magnitude slower during those first crucial seconds, so even our
“nuclear option” of writing pieces in Java may not be enough. The
unfortunate truth about Ruby libraries is that they simply do a lot of
execution during load, so our only two options are to eliminate all
that boot-time execution or to make it run a lot faster to compensate
for a cold VM.

- Charlie

To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email

I really just wish we could get some smarts rolled back into RubyGems.
The amount of work it does on every boot is really rather absurd, but
nobody’s bothered to fix it because it’s “fast enough” on MRI.
Unfortunately, it’s not “fast enough” on any of the jitting VMs, so we
suffer tremendously because of cold performance and all that boot-time
work.

As you mentioned offline, faster_require gets much of its improvement
from eliminating all the file statting that’s so slow for MRI on
Windows. Once we deal with these other issues (cold parse performance,
too much ruby code executing on rubygems boot, slow builtin library
loads), stat may be the next “big thing” to fix.

What about using gem_prelude in 1.8? This avoids loading full
rubygems…well, until somebody uses

gem ‘xxx’
or
Gem::Dependency

which they can avoid if they try :)

Of course, optimizing rubygems wouldn’t actually help other projects’
load times, ex: Redcar (10s)…hmm…it would appear from your comments
that it “cannot be fast for the first 5 seconds,” except for the things
we’ve been discussing, like trying to optimize the parser/using
spork/etc…

I suppose we should all be used to slow startup times…if we’ve ever
used rails, anyway :)

That being said, I’d still like to check out jruby’s characteristics using
faster_require, and plan on it sometime. Our hunch is it won’t help
much…

Guess we’re out of luck for the general case.

-rp

Hi Charlie,

On Tue, Mar 2, 2010 at 7:18 PM, Charles Oliver N. wrote:

yaml
0.421999931335449

This is somewhat troublesome, since this should be a mostly-Java
library. My current theory is that yaml is not shipping a set of
generated handles (like we do for JRuby core classes) which means
those handles have to be generated, verified, and classloaded separate
from JRuby proper. I’ll look into this today.

Yaml also monkey-patches various core libs (like date/time, etc.).
In order to do so, it first loads all those libraries. So even
though nobody really needs, say, the date library,
it will be loaded by the Yaml library.

We even have a bug for this: http://jira.codehaus.org/browse/JRUBY-4268

Thanks,
–Vladimir



On Tue, Mar 2, 2010 at 1:21 PM, Roger P. wrote:

What about using gem_prelude in 1.8? This avoids loading full
rubygems…well, until somebody uses

gem ‘xxx’
or
Gem::Dependency

which they can avoid if they try :)

That’s certainly a possibility, if it doesn’t change any behavior
normal users would see. My reading of the code is that it basically
just does a fast scan of gems from the known locations and makes their
lib dirs available for load/require…which certainly avoids most of
the RubyGems boot overhead.
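That reading can be sketched roughly as follows. This is not the actual gem_prelude code. The helper name is hypothetical, and the version handling is deliberately simplified: only purely numeric versions are recognised, so prerelease and platform-suffixed directories are skipped.

```ruby
# Hypothetical prelude-style scan: for every "name-version" directory under
# gems_dir, keep the highest version per gem and return its lib directory.
def latest_gem_lib_dirs(gems_dir)
  latest = {}
  Dir.glob(File.join(gems_dir, "*")).each do |dir|
    next unless File.directory?(dir)
    match = File.basename(dir).match(/\A(.+)-(\d+(?:\.\d+)*)\z/)
    next unless match
    name = match[1]
    version = match[2].split(".").map(&:to_i) # "1.10.0" => [1, 10, 0]
    current = latest[name]
    latest[name] = [version, dir] if current.nil? || (version <=> current[0]) == 1
  end
  latest.values
        .map { |_version, dir| File.join(dir, "lib") }
        .select { |lib| File.directory?(lib) }
end
```

A script would then append the result to $LOAD_PATH and never load the specs at all, which is exactly why gem 'xxx' with a version constraint cannot be honoured on this path.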

Of course, optimizing rubygems wouldn’t actually help other projects’
load times, ex: Redcar (10s)…hmm…it would appear from your comments
that it “cannot be fast for the first 5 seconds,” except for the things
we’ve been discussing, like trying to optimize the parser/using
spork/etc…

The tips in my blog post are probably the best place to start for e.g.
RedCar startup, but RedCar also loads a lot of stuff. The best way to
make things feel like they start up fast is to show something as
early as possible in the process, and load things after that. I hate
to say “splash screen” but getting the RedCar UI up as quickly as
possible would do a lot for perceived slowness.

I suppose we should all be used to slow startup times…if we’ve ever
used rails, anyway :)

That being said I’d still like to checkout jruby’s characteristics using
faster_require, and plan on it sometime. Our hunch is it won’t help
much…

I used faster_require for a while, and I think it definitely has merit
(and helps startup a bit too). Ultimately, though, we need a better
way to boot RubyGems without all the re-scanning, re-evaluating that
it does right now.

- Charlie


On Tue, Mar 2, 2010 at 2:57 PM, Vladimir S. wrote:

Yaml also monkey-patches various core libs (like date/time, etc.).
In order to do so, it first loads all those libraries. So even
though nobody really needs, say, the date library,
it will be loaded by the Yaml library.

We even have a bug for this: http://jira.codehaus.org/browse/JRUBY-4268

If the monkey-patching only adds, it could possibly be done without
loading the library (since loading the library later would just
inherit the monkey patches). We could also modify those libraries to
check if defined? YAML and load the relevant bits themselves, rather
than YAML actively infecting the world. That might be the cleanest
option…

  1. requiring YAML first would make YAML be defined, and subsequent
    libraries loaded would load the monkey patches
  2. requiring YAML later would see that those libraries are loaded and
    add the missing bits
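The two cases above can be sketched with stand-ins. All names here (FakeYAML, FakeDate, to_fake_yaml) are hypothetical and only illustrate the defined? check; the real yaml and date libraries are not touched.

```ruby
# Stand-in for the methods YAML would add to another library's class.
def add_yaml_support(klass)
  klass.class_eval do
    define_method(:to_fake_yaml) { "--- #{self.class.name}" }
  end
end

# Stand-in for loading yaml.rb: it only patches libraries already loaded (case 2).
def load_fake_yaml
  Object.const_set(:FakeYAML, Module.new) unless Object.const_defined?(:FakeYAML)
  add_yaml_support(FakeDate) if defined?(FakeDate)
end

# Stand-in for loading date.rb: it adds the YAML bits itself if YAML is
# already defined (case 1).
def load_fake_date
  Object.const_set(:FakeDate, Class.new) unless Object.const_defined?(:FakeDate)
  add_yaml_support(FakeDate) if defined?(FakeYAML)
end

# Case 1 order shown here; swapping the two calls exercises case 2.
load_fake_yaml
load_fake_date
puts FakeDate.new.respond_to?(:to_fake_yaml)
```

Neither stand-in forces the other to load, which is the point: date would no longer be dragged in just because yaml was required.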

Yay for dependency-tracking hell combined with monkey-patching hell :)

- Charlie


Continued discussion a bit here:

http://groups.google.com/group/ruby-optimization/browse_thread/thread/a25576291813a05b