(Cross-posted to JRuby ML)
I am nearly done with building out the benchmarking server for JRuby and
Rubinius. I am using the benchmarks from the Rubinius project
(rubinius/benchmark/core/*). There are around 90 benchmark files, each
with a varying number of tests, so each run produces 522 reports right
now.
I have several questions and concerns that I’ll detail here. We can
discuss these either here or on irc; I have no preference.
Issues…
- Benchmark names
The name given in the benchmark (x.report("this is the name")) acts as
the database key for the codespeed server. For continuity, we do not
want these names changing once they are in the system; otherwise old
results will become decoupled from new results.
e.g. in core/array/bench_append.rb, "append array" would be a different
report from "Append array".
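For reference, here is roughly what one of these benchmark files looks
like in benchmark_suite / benchmark-ips style. This is a sketch from
memory, not copied from the Rubinius tree, and the benchmark body is
illustrative:

  require 'benchmark'
  require 'benchmark/ips'

  Benchmark.ips do |x|
    # The string passed to #report becomes the codespeed database key, so
    # renaming it ("append array" -> "Append array") silently starts a new
    # result series disconnected from the old one.
    x.report("append array") do
      ary = []
      1_000.times { |i| ary << i }
    end
  end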
So, I’d like to “lock down” the benchmark files so the names on old
benchmarks don’t get changed without an extra sign-off.
Additionally, some of the names are a tad long (though very
descriptive). The codespeed server wants to limit names to 30 chars (I
modified the source to get beyond this), but we probably do want to cap
the length at something reasonable like 100 chars, a number I pulled out
of the air. The length limit increases readability on the codespeed
website.
- Number of runs (EC2 charges)
At this time I have it configured to do two runs for JRuby and two for
Rubinius on each commit to their respective repositories. The JRuby runs
use '--client' and '--server', while the Rubinius runs use '-Xint' and
the default JIT-enabled mode. Based on the number of benchmarks, each
commit takes around 1 hour to complete. Since EC2 charges by the
instance hour consumed, this may get expensive.
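Conceptually the run matrix is just this (a hypothetical sketch; the
actual harness configuration may be structured differently):

  # Each commit to a project triggers one full benchmark pass per flag
  # set for that project's runtime.
  RUN_MATRIX = {
    'jruby' => ['--client', '--server'],
    'rbx'   => ['-Xint', '']   # '' = default settings, i.e. JIT enabled
  }

  RUN_MATRIX.each do |runtime, flag_sets|
    flag_sets.each do |flags|
      # each (runtime, flags) pair is one pass over the ~90 benchmark files
      puts [runtime, flags].reject(&:empty?).join(' ')
    end
  end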
Couple that with the fact that every commit triggers a run, and suddenly
we see a bunch of runs queue up due to documentation changes, typo
fixes, etc. Perhaps the commits need to be filtered so that those that
do not contain any source file changes (*.java, *.rb, *.hpp, *.cpp,
etc.) are ignored; something like the sketch below.
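Assuming we can get the list of paths each commit touches (from the
post-receive payload or a "git show --name-only"), the filter could be
as simple as this; the extension list and method names are made up for
illustration:

  SOURCE_EXTENSIONS = %w[.java .rb .hpp .cpp .c .h]

  def touches_source?(changed_paths)
    changed_paths.any? { |path| SOURCE_EXTENSIONS.include?(File.extname(path)) }
  end

  # Only enqueue a benchmark run when the commit changes source files;
  # doc-only and typo-fix commits are skipped.
  # enqueue_run(commit) if touches_source?(commit.changed_paths)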
Overall, I’m concerned that this benchmark server is going to be
expensive.
- Benchmark repository
I think the benchmarks need to be moved out from under the Rubinius
repository and into a separate one. Also, they each need to be modified
to work out-of-the-box with the benchmark_suite gem (they don’t work
with it now). I already know what changes need to be made, so this is
relatively simple.
The concern becomes project ownership of these benchmarks. Should the
new repository live under the JRuby or the Rubinius organization? What
happens when we add MRI or Maglev or IronRuby or ?? to the list of
supported runtimes?
I suggest that the benchmarks be spun off to a new “ruby community”
organization. That would keep access open to runtimes outside the
Engine Yard ecosystem (if that’s a concern to anyone) while also
providing a semblance of impartiality.
- Database & webserver performance
I need to get back to my “day job” now that this works. I don’t have
the time or the skills to replace sqlite3 with a more enterprisey DB,
nor do I know how to reconfigure Django to use Apache/nginx/etc. instead
of the default Python-based web server.
So, if this site is popular and gets lots of hits, it may fall over.
I’d like to put out a general call to the Ruby community for some
volunteer(s) to step in and do this optimization. It’s probably okay to
wait and see whether the site actually does get crushed before spending
any more time on it.
- GitHub post-receive hook
The code that runs the benchmarks looks at commits in two ways. Upon
startup it updates each repository, looks up the last 25 commits, and
enqueues any it hasn’t seen yet (tracked in a small sqlite db). After
this initialization step, it starts listening for commits that are
published from GitHub.
So we’ll need each project to add a post-receive hook that points at
this server. It’s super easy to do this through the admin control panel.
Both projects probably already have these set up for their CI servers.
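To make the flow concrete, the bookkeeping boils down to something like
this; last_25_shas and enqueue_benchmark_run are placeholder helpers for
illustration, not the actual methods in the code:

  require 'sqlite3'

  # Placeholder helpers, for illustration only.
  def last_25_shas(repo_dir)
    `git -C #{repo_dir} log -25 --format=%H`.split("\n")
  end

  def enqueue_benchmark_run(project, sha)
    puts "queued #{project} #{sha}"   # stand-in for the real queueing code
  end

  db = SQLite3::Database.new('seen_commits.db')
  db.execute('CREATE TABLE IF NOT EXISTS commits (sha TEXT PRIMARY KEY)')

  # 1. Startup: update the repo, then enqueue any of the last 25 commits
  #    that have not been recorded yet.
  last_25_shas('rubinius').each do |sha|
    next if db.get_first_value('SELECT sha FROM commits WHERE sha = ?', sha)
    db.execute('INSERT INTO commits (sha) VALUES (?)', sha)
    enqueue_benchmark_run('rubinius', sha)
  end

  # 2. Steady state: GitHub POSTs a payload to the post-receive URL on
  #    every push; each unseen commit in the payload is enqueued the same way.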
The code I wrote to handle all of this bookkeeping will be pushed to
GitHub early this week. The project name is currently
‘benchmark_pipeline’, so if you hate the name please speak up now. The
code ain’t perfect but it is a decent foundation to work from. With a
little refactoring it could probably be adopted by any project that
wanted to set up its own codespeed server to track performance.
cr