Using hashes as keys in hashes

stevena · November 23, 2005, 2:53pm

I’ve seen several posts related in some way to the subject of using
hashes as keys in hashes, but I haven’t seen a clear solution to the
issue. My problem is I want to generate a set of unique hashes.
Since hash keys are unique, I was hoping to just put the hashes in as
keys, something like:

myUniqueHashes[aHash] ||= 0
myUniqueHashes[aHash] += 1

This would not only give me a list of unique hashes, but it would
also tell me how many times each one was seen.

Unfortunately, this does not work because in a hash, each different
hash that is inserted as a key is considered to be different, even if
the contents are the same. When used as keys in a hash, two hashes
are considered equal if aHash.id = bHash.id, meaning if they are the
very same hash located at the same place in memory.

At the micro level, what I need is a special kind of hash that will
consider two hash keys to be equal if aHash == bHash. At a higher
level, I need an efficient way to collect a unique set of hashes. An
array would work, but for a large set it’d be much slower…and
storing the number of accesses would be relatively clunky compared to
a hash’s interface.

What is the Ruby way to solve this problem?

Thanks,
steve

stevena · November 23, 2005, 3:09pm

Hi –

On Wed, 23 Nov 2005, Steven Arnold wrote:

also tell me how many times each one was seen.
An array would work, but for a large set it’d be much slower…and
storing the number of accesses would be relatively clunky compared
to a hash’s interface.

What is the Ruby way to solve this problem?

You could define an appropriate default behavior for the hash of
hashes:

unique_hashes = Hash.new do |hash,key|
existing = hash.keys.find {|k| k == key }
if existing
hash[existing] += 1
else
hash[key] = 1
end
end

a = {“a”,“b”}
b = {“a”,“b”}

unique_hashes[a]
unique_hashes[b]
unique_hashes[{“some”,“other”}]

p unique_hashes # {{“some”=>“other”}=>1, {“a”=>“b”}=>2}

David

stevena · November 23, 2005, 4:06pm

On Nov 23, 2005, at 9:06 AM, David A. Black wrote:

end
This is a cool idea (and taught me about the block approach to
creating hashes, thanks!). The one (seeming) flaw is the
hash.keys.find, which would do an O(n) search on the hash.keys array,
giving O(n) instead of a hash’s normal O(ln(n)) behavior. Am I wrong
about that?

steve

stevena · November 23, 2005, 4:26pm

On Thu, Nov 24, 2005 at 12:03:48AM +0900, Steven D. Arnold wrote:

end
end

This is a cool idea (and taught me about the block approach to
creating hashes, thanks!). The one (seeming) flaw is the
hash.keys.find, which would do an O(n) search on the hash.keys array,
giving O(n) instead of a hash’s normal O(ln(n)) behavior.

[O(1)]

Yes.

unique_hashes = Hash.new do |h,k|
k2 = Struct.new(:h) do
def eql?(o); h == o.h end
def hash; h.to_a.hash end
end.new(k) # hoise the Struct def or use singleton methods for
better
# performance
h.has_key?(k2) ? h[k2] += 1 : h[k2] = 1
end

unique_hashes[{1 => 2}] # => 1
[{1 => 2}.object_id, {1 => 2}.object_id] # => [-605360772, -605360782]
unique_hashes[{2 => 3}] # => 1
unique_hashes[{1 => 2}] # => 2
unique_hashes[{1 => 2}] # => 3
RUBY_VERSION # => “1.8.3”

You can also redefine Hash#{eql?,hash} globally or using singleton
methods, of course.