Nested hash with arrays for default value

Glen_H · January 25, 2010, 6:52pm

I’m trying to find a “nice” way to make a nested hash with an empty array as the default “leaf” value.

Basically I’d like to be able to make an assignment as follows:

data[2][3][4][5] << 3

I can get close but I can’t get it right. The data is going to be coming straight out of a log, so I can’t really build the hash ahead of time.

–
“Hey brother Christian with your high and mighty errand, Your actions speak so loud, I can’t hear a word you’re saying.”

-Greg Graffin (Bad Religion)

Glen_H · January 25, 2010, 7:00pm

An empty array?
Well… You can try this:
http://trevoke.net/blog/2009/11/06/auto-vivifying-hashes-in-ruby/
As indicated there, I didn’t come up with this, but it’ll take care of creating the hashes for you. You can probably do a check: if nil, then create the array, then add to it.
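
A minimal sketch of that idea, assuming the auto-vivifying hash from the linked post is the usual Hash.new block that reuses its own default_proc. One caveat: with such a hash a missing leaf comes back as a fresh empty hash rather than nil, so this sketch checks key? on the parent instead of testing for nil. The helper name append_leaf is hypothetical, not something from the post.

```ruby
# Auto-vivifying hash: a missing key creates (and stores) another hash
# built with the same default block.
vivify = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

# Hypothetical helper: walk down to the parent of the leaf, then create
# the array only when the leaf key is absent.
def append_leaf(tree, keys, value)
  *path, last = keys
  parent = path.inject(tree) { |node, key| node[key] }
  parent[last] = [] unless parent.key?(last)
  parent[last] << value
end

append_leaf(vivify, [2, 3, 4, 5], 3)
append_leaf(vivify, [2, 3, 4, 5], 4)
p vivify
```

The downside is that every append has to go through the helper; a plain data[2][3][4][5] << 3 still would not work with this approach.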

Glen_H · January 25, 2010, 7:17pm

I’ve not tested this too much, but what I tried was to set up a proxy object that would insert a hash if the [] method is called on it, or an array if the << method is called:

class ProxyDefault
  def initialize(hash, key)
    @hash = hash
    @key = key
  end

  # Indexing the proxy: store a hash under the remembered key,
  # then look the new key up in that hash.
  def [](key)
    @hash[@key] = Hash.new { |hash, key| ProxyDefault.new(hash, key) }
    @hash[@key][key]
  end

  # Appending to the proxy: store an array under the remembered key.
  def <<(value)
    @hash[@key] = []
    @hash[@key] << value
  end
end

h = Hash.new { |hash, value| ProxyDefault.new(hash, value) }

h[1][2][3] << "value"

p h
p h[1][2][3]

/temp$ ruby nested_hash_array.rb
{1=>{2=>{3=>["value"]}}}
["value"]

Hope this helps,

Jesus.

Glen_H · January 25, 2010, 10:30pm

On Jan 25, 2010, at 2:24 PM, Glen H. wrote:

I’ll play around with your solution. I have the following:

data = Hash.new { |l, k| l[k] = Hash.new { |l, k| l[k] = Hash.new { |l, k| l[k] = Hash.new([]) }}}

I’m assuming you want ‘infinite’ depth. Consider:

default = lambda { |h,k| h[k] = Hash.new(&default) }
top = Hash.new(&default)

Gary W.

Glen_H · January 25, 2010, 8:25pm

Thanks Jesus,

I’ll play around with your solution. I have the following:

data = Hash.new { |l, k| l[k] = Hash.new { |l, k| l[k] = Hash.new { |l, k| l[k] = Hash.new([]) }}}

Believe me, I know it’s ugly and not in any way flexible. Plus it behaves in ways which make me uncomfortable when I try to print the contents.
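
The odd printing is most likely down to the innermost Hash.new([]). That form hands every missing key the same array object and never stores it in the hash, so appended values end up in the default rather than under a key. A small sketch of the difference (the variable names shared and stored are just for illustration):

```ruby
# Hash.new([]) hands out one shared array and never stores it under a key.
shared = Hash.new([])
shared[:a] << 1
shared[:b] << 2
p shared            # the hash itself is still empty
p shared[:missing]  # the single default array has collected both values

# Block form: each missing key gets its own array, stored in the hash.
stored = Hash.new { |hash, key| hash[key] = [] }
stored[:a] << 1
stored[:b] << 2
p stored
```

Swapping the innermost Hash.new([]) for the block form should make the four-level version print what was appended, though it stays fixed at four levels.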

Glen_H · January 25, 2010, 10:50pm

2010/1/25 Jesús Gabriel y Galán [email protected]

I’m assuming you want ‘infinite’ depth. Consider:

default = lambda { |h,k| h[k] = Hash.new(&default) }
top = Hash.new(&default)

The problem is that he wants the leaves of the hash to be arrays, and
not hashes.

Jesus.

Exactly, infinite depth would be nice as it would make a more temporally portable solution. The proxy looks to be working great. I am a bit confused as to why the << method in the proxy doesn’t overwrite a leaf with a new array though. I’m not complaining as it works the way I want it to; I’m just perplexed.

Thanks Jesus.

Glen_H · January 25, 2010, 10:43pm

On Mon, Jan 25, 2010 at 10:28 PM, Gary W. [email protected] wrote:

default = lambda { |h,k| h[k] = Hash.new(&default) }
top = Hash.new(&default)

The problem is that he wants the leaves of the hash to be arrays, and
not hashes.
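
To make the difference concrete, here is a rough sketch. The first half uses the lambda exactly as quoted; the second half is a hypothetical fixed-depth alternative (the method name nested_with_array_leaves is made up for illustration) that puts arrays below the last key, at the cost of deciding the depth up front.

```ruby
# Infinite-depth default as quoted: every missing key becomes another hash.
default = lambda { |h, k| h[k] = Hash.new(&default) }
top = Hash.new(&default)

begin
  top[2][3][4][5] << 3   # the leaf is a Hash, which has no << method
rescue NoMethodError => e
  puts "append failed: #{e.class}"
end

# Hypothetical fixed-depth alternative: hashes for the first depth - 1
# levels, arrays below the last key.
def nested_with_array_leaves(depth)
  if depth <= 1
    Hash.new { |h, k| h[k] = [] }
  else
    Hash.new { |h, k| h[k] = nested_with_array_leaves(depth - 1) }
  end
end

data = nested_with_array_leaves(4)
data[2][3][4][5] << 3
data[2][3][4][5] << 4
p data
```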

Jesus.

Glen_H · January 26, 2010, 4:23am

On Jan 25, 2010, at 4:49 PM, Glen H. wrote:

2010/1/25 Jesús Gabriel y Galán [email protected]

The problem is that he wants the leaves of the hash to be arrays, and
not hashes.

Exactly, infinite depth would be nice as it would make a more temporally
portable solution. The proxy looks to be working great. I am a bit
confused as to why the << method in the proxy doesn’t overwrite a leaf with
a new array though. I’m not complaining as it works the way I want it to,
I’m just perplexed.

Oops. Sorry for the confusion. The trick with the proxy is that the
first time << is called on the proxy, it replaces itself with an empty
array. Further lookups will return the array and not the original
proxy.

Gary W.

Glen_H · January 26, 2010, 3:30pm

Sorry, I should have been more specific when stating my confusion. I am
confused as to why appending a second item into a leaf results in a
multi-item array rather than a new array with only the second item.

data[1][2][3] << 4
data[1][2][3] << 5

yields
{1=>{2=>{3=>[4,5]}}}
Looking at the proxy, I was expecting
{1=>{2=>{3=>[5]}}}

The behavior I’m seeing is what I want; I just didn’t expect it. From the code it looks like << assigns an array to the key then appends a value. I was expecting that to overwrite the array created with the first << call at that level with a new single-item array.

Glen_H · January 26, 2010, 10:01am

On Mon, Jan 25, 2010 at 10:49 PM, Glen H. [email protected]
wrote:

Exactly, infinite depth would be nice as it would make a more temporally
portable solution.

With it, you have infinite depth, until in a branch you decide to stop
by appending (<<) a value.
Then you fix the depth of that branch.

The proxy looks to be working great. I am a bit
confused as to why the << method in the proxy doesn’t overwrite a leaf with
a new array though. I’m not complaining as it works the way I want it to,
I’m just perplexed.

When you access h[1][2][3], the default proc returns a proxy object for that key. The proxy object remembers the hash and the key, but nothing is stored in the hash yet. When you call << on the proxy object, it stores an array in the hash under that key and appends the value to it. So further calls to h[1][2][3] will return that array and no proxy objects anymore. It works the same for the upper levels: calling h[1] returns a proxy. When you call [] on it (for example h[1][2]) it stores a hash under h[1].

Maybe this clarifies a bit more:

/temp$ cat nested_hash_array.rb && ruby nested_hash_array.rb
class ProxyDefault
  def initialize(hash, key)
    @hash = hash
    @key = key
  end

  def [](key)
    puts "the hash is: #{@hash.inspect} when calling [] on the proxy object"
    @hash[@key] = Hash.new { |hash, key| ProxyDefault.new(hash, key) }
    puts "the hash is: #{@hash.inspect} after replacing the proxy with a hash"
    @hash[@key][key]
  end

  def <<(value)
    puts "the hash is: #{@hash.inspect} when calling << on the proxy object"
    @hash[@key] = [value]
    puts "the hash is: #{@hash.inspect} after replacing the proxy with an array"
    @hash[@key]
  end
end

h = Hash.new { |hash, value| ProxyDefault.new(hash, value) }

h[1][2] << "value"

p h
p h[1][2]

the hash is: {} when calling [] on the proxy object
the hash is: {1=>{}} after replacing the proxy with a hash
the hash is: {} when calling << on the proxy object
the hash is: {2=>["value"]} after replacing the proxy with an array
{1=>{2=>["value"]}}
["value"]

Jesus.

Glen_H · January 26, 2010, 4:35pm

I understood your question, so this means I explained myself really
badly :-).
When you do this:

h[1] << 4

The following things happen:

1. The method [] of h is called with parameter 1.
2. The hash detects that there’s no entry for that key, and so calls the default proc.
3. The default proc creates a Proxy object for that key (this proxy object remembers the hash and the key); nothing is stored in the hash yet.
4. The result of the default proc (which is the proxy object itself) is returned.
5. The method << with parameter 4 is called on the proxy object.
6. That method stores an array with element 4 inside in the hash under the remembered key, so the array takes the proxy’s place.

From now on, every time you call h[1] there is actually a value in the
hash, which is the array created by the proxy, and so the hash doesn’t
call the default proc anymore, and no other proxy object is involved.
Subsequent calls to h[1] << some_value will actually call the <<
method of the array.
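
A quick way to check this for yourself is to hold on to the proxy and look at the classes before and after the first append. This is only a sketch: it repeats the ProxyDefault class from earlier in the thread (without the debug output), and the variable name first is just for illustration.

```ruby
class ProxyDefault
  def initialize(hash, key)
    @hash = hash
    @key = key
  end

  def [](key)
    @hash[@key] = Hash.new { |h, k| ProxyDefault.new(h, k) }
    @hash[@key][key]
  end

  def <<(value)
    @hash[@key] = [value]
  end
end

h = Hash.new { |hash, key| ProxyDefault.new(hash, key) }

first = h[1]
puts first.class   # the proxy handed back by the default proc
puts h.key?(1)     # nothing has been stored under key 1 yet

first << 4         # ProxyDefault#<< stores an array under key 1
puts h[1].class    # a real entry now exists, so the default proc is skipped

h[1] << 5          # plain Array#<< on the stored array
p h[1]
```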

Hope this clears up the issue a little bit more.

Jesus.

Glen_H · January 26, 2010, 6:07pm

No, you didn’t actually. I just wasn’t thinking about it properly. When I actually think about your explanation further, the whole “no more proxy object” part makes perfect sense and answers my question.

Thanks for your patience and help Jesus. It is appreciated.

bkatzung · July 15, 2019, 12:09am

Since ~2014, you can also use the XKeys Gem.

require 'xkeys'

data = {}.extend XKeys::Hash
data[2, 3, 4, 5, :[]] = 3 # :[] is the next array index
data[2, 3, 4, 5, :[]] = 4
# {2=>{3=>{4=>{5=>[3, 4]}}}}
data[1] # nil
data[1, :else => []] # []
# Note: data[int1, int2] will try to array slice
# Use data[1, 2, {}] (empty option hash) for data[1][2]