Looking for Set implementation in C

Shot_SPiotr_S.S · November 16, 2009, 7:57am

Hello, good folks of ruby-talk.

The Set class in MRI is implemented in pure Ruby (lib/set.rb). I vaguely
remember someone posting here a C (re)implementation of the class, but
I canâ€™t remember the details and itâ€™s hard to google it â€“ all I could
find is the [ruby-talk:24465] thread.

Does anyone know where can I look for a set.c implementation?

â€” Shot

Shot_SPiotr_S.S · November 16, 2009, 10:39am

2009/11/16 Shot (Piotr S.) [email protected]:

The Set class in MRI is implemented in pure Ruby (lib/set.rb). I vaguely
remember someone posting here a C (re)implementation of the class, but
I can’t remember the details and it’s hard to google it – all I could
find is the [ruby-talk:24465] thread.

Why do you need that? Do you encounter any performance issues in the
adapter code to Hash (which is implemented in C)?

Kind regards

robert

Shot_SPiotr_S.S · November 16, 2009, 6:38pm

I think if you have rbtree installed it uses this for the SortedSet
implementation.

Shot_SPiotr_S.S · November 16, 2009, 10:12pm

Dominic S.:

I think if you have rbtree installed it
uses this for the SortedSet implementation.

Thanks, rbtree.c does seem to be full of Ruby/C goodness.
A good read for a long Autumn evening learning session.

â€” Shot

Shot_SPiotr_S.S · November 16, 2009, 10:07pm

Robert K.:

2009/11/16 Shot (Piotr S.) [email protected]:

The Set class in MRI is implemented in pure Ruby (lib/set.rb). I vaguely
remember someone posting here a C (re)implementation of the class, but
I canâ€™t remember the details and itâ€™s hard to google it â€“ all I could
find is the [ruby-talk:24465] thread.

Why do you need that? Do you encounter any performance issues
in the adapter code to Hash (which is implemented in C)?

My initial performance tests (with the very nice perftools.rb gem from
Aman G.) seem to suggest that I spend most of the time in (C-based)
Hash#each_key, mostly due to calls from Set#each â€“ but it looks like
the Set#each calls themselves add a significant overhead. Thus, I was
thinking that maybe a Set class which does not do additional Ruby calls,
but is itself implemented in C, might be faster.

Also, I just surprisingly discovered that
if I want to have Set#pairs such that

Set[1,2,3,4].pairs.to_a # => [[1,2], [1,3], [1,4], [2,3], [2,4], [3,4]]

then doing

is actually much slower than the simple

module Enumerable
def pairs
combination 2
end
end

class Set
def pairs
to_a.pairs
end
end

which both surprises me a bit (I assumed the short lived Array creation
was expensive) and strenghtens my suspicions that the Ruby-based
Set class is rather slow and/or Array#combinationâ€™s Enumerator
initialisation is much faster (but, again, maybe because
Array#combination is written in C).

This, in turn, made me think that given that most of the core of my code
consists of operations on frozen Sets of Integers (usually Bignums),
then maybe a custom class for them (eventually most probably ported to
C or D) would make a lot of sense. As my C-fu is still somewhere between
rusty-beyond-all-hope and nonexistent, Iâ€™m looking for a Set in C so
that I can learn how itâ€™s done â€“ although I should probably just look
closely at hash.c, maybe with some ruby2c translation of set.rb to
speed-up my understanding of Rubyâ€™s C interface.

Of course, the custom (immutable) class might also have the benefit of
caching the #pairsâ€™ result, or at least its #to_a Array counterpart.

(For some reason I was also assuming that a custom IntSet class would
side-step the need for backporting r22308Â¹ to all my Ruby installs;
I then recalled that it fixes Bignum#hash, not Set#hash â€“ but now
I again think that if I build my custom IntSet class and am very
vigilant about not using Bignums in regular Sets or as Hash keys,
then I might really end up not needing the backport any more. I guess
redefining Bignum#hash to raise an exception would make a good guard
against this bug and show me how many places actually trigger it at the
moment.)

Â¹ http://redmine.ruby-lang.org/repositories/diff/ruby-19?rev=22308

â€” Shot

Shot_SPiotr_S.S · November 17, 2009, 1:54pm

2009/11/16 Shot (Piotr S.) [email protected]:

in the adapter code to Hash (which is implemented in C)?

My initial performance tests (with the very nice perftools.rb gem from
Aman G.) seem to suggest that I spend most of the time in (C-based)
Hash#each_key, mostly due to calls from Set#each â€“ but it looks like
the Set#each calls themselves add a significant overhead.

Interestingly that does not need to be the case. Try out the attached
test. I am suspecting thought that differences might be caused by GC
because I cannot see a clear pattern and in some cases MySet is faster
than Hash with each_key which technically cannot be the case because
one calls the other.

Kind regards

robert

Shot_SPiotr_S.S · November 18, 2009, 8:04am

Robert K.:

2009/11/16 Shot (Piotr S.) [email protected]:

My initial performance tests (with the very nice perftools.rb gem from
Aman G.) seem to suggest that I spend most of the time in (C-based)
Hash#each_key, mostly due to calls from Set#each â€“ but it looks like
the Set#each calls themselves add a significant overhead.

Interestingly that does not need to be
the case. Try out the attached test.

Thanks so much for the thorough tests! The results on my machine do
look interesting (benched both MRI 1.8 and 1.9, and did another run
with SMALL = 100): sb.rb · GitHub

Hmmm, I wonder how the results would look with some Ruby Inline love.

I am suspecting thought that differences might be caused by GC

perftools.rb (used to profile my code) have the vice that theyâ€™re
not really counting time spent in certain methods, but rather every
now and then (by default 100 times a second) poll the process for
the current stacktrace and do a statistic analysis afterwards:

There are two virtues of this approach, though: first, the profiling is
external to the running process and does not impact it (in particular,
does not slow the process down, nor alter the relative method execution
times); second, GC is counted separately.

Iâ€™ll see what perftools.rb show for the benched cases.

because I cannot see a clear pattern and in some cases MySet is faster
than Hash with each_key which technically cannot be the case because
one calls the other.

In my runs MySet#each is consistently faster than Hash#each_key in the
â€˜largeâ€™ benches (small numbers of iterations over larger enums) and
consistently slower in the â€˜smallâ€™ benches (large numbers of iterations
over smaller enums) â€“ and this is after re-running the benchmark with
GC.disable.

Interestingly, while in the â€˜realâ€™ part of the benchmarks 1.9 is
generally faster than of 1.8, the â€˜rehearsalâ€™ stage is sometimes
slower in 1.9. Iâ€™m wondering how does this translate to my case,
where I might have quite a few short-lived, one-time Sets (though
I probably should pull the instantiations code into the benchmark
to simulate my this case before making any guesses at it).

â€” Shot

Shot_SPiotr_S.S · November 18, 2009, 9:11am

2009/11/18 Shot (Piotr S.) [email protected]:

the case. Try out the attached test.

Thanks so much for the thorough tests! The results on my machine do
look interesting (benched both MRI 1.8 and 1.9, and did another run
with SMALL = 100): sb.rb · GitHub

You’re welcome!

In my runs MySet#each is consistently faster than Hash#each_key in the
‘large’ benches (small numbers of iterations over larger enums) and
consistently slower in the ‘small’ benches (large numbers of iterations
over smaller enums) – and this is after re-running the benchmark with
GC.disable.

This is suspicious. The only explanation I have right now is that
this is an artifact of the sampling process which is not accurate
(similar as with Java’s -Xrunhprof:cpu=samples).

Interestingly, while in the ‘real’ part of the benchmarks 1.9 is
generally faster than of 1.8, the ‘rehearsal’ stage is sometimes
slower in 1.9. I’m wondering how does this translate to my case,
where I might have quite a few short-lived, one-time Sets (though
I probably should pull the instantiations code into the benchmark
to simulate my this case before making any guesses at it).

Yet another interesting question. The tests should of course
mimic the real world case as closely as possible.

Thanks for the interesting discussion.

Kind regards

robert