A use case for an ordered hash

Pit captain:

Sven S. schrieb:

And I am 100% in favour of a order-preserving hash in the Ruby core.
snip
Sven, regarding the “HistoryHash” part of your post:

In order to replace the standard Hash with a “better” Hash, there should
be a clear agreement on its desired behaviour. I don’t think this
agreement has been achieved yet. What should be the result of the
following code?

snip

I’ve not been following this post attentively…

My vote is against the proposal (to add semantics to Hash)- for what
it’s worth :slight_smile:

I am very much in favour of the Hash remaining a Hash. It’s an extremely
useful datatype, and it’s clear what it does and doesn’t do.

Personally, I would very much like to see an ordered container in the
standard distribution. It would behave like the c++ stl’s map or set. (I
think these are commonly implemented on top of a b-tree, but other data
types are possible, I believe.)

Containers like this let you:

  • iterate through entities in an ordered way.
  • have log(n) insert time
  • have log(n) search time
  • let you to query for a close match (not necessarily an exact hit).

A Hash that also remembers insertion order seems like it could be
useful, although I don’t think I’ve needed one (while I have wanted the
container described above). I definitely don’t think such an order
preserving hash should replace what is in the standard though. In the
same way I wouldn’t want to add semantics to Array or Set.

The pragmatic approach (and what I imagine will happen) is that if
someone wants these, they should write them. If they’re then widely
used, they should go in to the standard distribution.

Cheers,
Benjohn

On 8/24/06, Sven S. [email protected] wrote:

And actually, I would prefer the behavior of Hash to be replaced
by the behavior of “HistoryHash”.
Enough people do want this feature,
a history-preserving hash does not cost much more (in terms of memory
and execution time)
and it is consistent with tho old hash behavior.
So, for the sake of simplicity, no new class, let’s just replace
Hash by a “better” Hash.

For the record, I am 100% against this. Even if the semantics of
insertion order preservation could be defined to everyone’s
satisfaction, it most assuredly does have an extra cost, and forcing
people who just want a normal hash to pay that cost is infeasible.

martin

On 8/23/06, Robert D. [email protected] wrote:

fried egg on top

De gustibus non dispudandum esse .
But sounds nice to me :wink:
What exectly is a hash (cuisine not informatics)?


Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Pit C. wrote:

In order to replace the standard Hash with a “better” Hash, there should
be a clear agreement on its desired behaviour. I don’t think this
agreement has been achieved yet. What should be the result of the
following code?

hh = HistoryHash.new
hh[:one] = 1
hh[:two] = 2
hh[:one] = 3
hh.each do |k, v| p k end

Should it be
:one
:two
or
:two
:one
Why should it be the one or the other?

Hi Pit,

the first is the semantics of replacement,
the second the semantics of addition.

I think that replacement should be the normal semantics,
the second can then always be achieved by deleting the key
before assigning the new value.

The other way round, ie if addition were the default semantics,
we were forced to write hh[:one].replace( ), which does not
work for Fixnums.

But anyway, the recent posts induced some second thoughts in me
about having HistoryHash as the standard Hash:

Always having to keep the order can also be a drawback,
for instance:
Future core Ruby support for parallel execution could be such
that it allows different threads to add keys to one hash
and leave it undefined in which order the additions were made.

So, I am inclined to say:
Yes, OK, OK, the “good old” Hash is the cleaner solution.

Although the HistoryHash would have fitted better to my idea of
“parameter-list as Hash”. But perhaps that really was too crazy.

Anyway, the performance cost of a HistoryHash would be
the cost of keeping an extra list of the entries,
this means (only) a pointer per entry.

Regards

Sven

Regards,
Pit

On Thu, 24 Aug 2006, Pit C. wrote:

hh[:one] = 3
:one

Why should it be the one or the other?

harp:~ > cat a.rb
require ‘rubygems’
require ‘alib’
HistoryHash = Alib::OrderedHash

hh = HistoryHash.new
hh[:one] = 1
hh[:two] = 2
hh[:one] = 3
hh.each do |k, v| p k end

harp:~ > ruby a.rb
:one
:two

because it’s based on insertion order. if one wants a new insertion
use:

harp:~ > cat a.rb
require ‘rubygems’
require ‘alib’
HistoryHash = Alib::OrderedHash

hh = HistoryHash.new

hh[:one] = 1
hh[:two] = 2
hh[:three] = 3

hh.delete :two
hh[:two] = 2

hh.each do |k, v| p k end

harp:~ > ruby a.rb
:one
:three
:two

of course that’s my 2 cts.

-a

On 8/24/06, [email protected] [email protected] wrote:

:two

because it’s based on insertion order. if one wants a new insertion use:

That depends on what you see yourself as inserting, though. If you
take it as k=>v pairs, then you can see :one => 1 being replaced by
:one => 3, which has then been inserted after :two => 2.

martin

On Fri, 25 Aug 2006, Martin DeMello wrote:

:one
:two

because it’s based on insertion order. if one wants a new insertion
use:

That depends on what you see yourself as inserting, though. If you
take it as k=>v pairs, then you can see :one => 1 being replaced by
:one => 3, which has then been inserted after :two => 2.

i’d say :one was updated or replaced, not inserted - sorry i wasn’t
clear.

it is arbitrary though…

-a

[email protected] wrote:

The pragmatic approach (and what I imagine will happen) is that if
someone wants these, they should write them. If they’re then widely
used, they should go in to the standard distribution.

Yes. The only catch is that a full implementation (with literals)
requires a change in the Ruby interpreter.

Hal

Pit C. wrote:

hh[:two] = 2
:two
:one

Why should it be the one or the other?

An excellent point, and there are other similar questions
to be answered.

As for replacing Hash with another implementation, I would
be OK with it (if it proved reasonably speedy).

Others would not want it even if it were the same speed
as before.

I think the people calling for an ordered hash are a sizable
group, but still a minority.

Hal

Hal F. wrote:

actions = { /abcd/  => lambda { do_this },

before /abc/, and so on.

So yeah, I ended up using an array of arrays. But it
just felt wrong.

Hal

Can’t you create your own using mixins and including enumerable or
something like that ?

surf wrote:

Can’t you create your own using mixins and including enumerable or
something like that ?

Yes, but I would lose the convenient literal notation.

Hal

On 8/24/06, Hal F. [email protected] wrote:

surf wrote:

Can’t you create your own using mixins and including enumerable or
something like that ?
Yes, but I would lose the convenient literal notation.

…and which could have been seen by someone who looked at more than
the last two messages of a thread.

-austin

Austin Z. wrote:

On 8/24/06, Hal F. [email protected] wrote:

surf wrote:

Can’t you create your own using mixins and including enumerable or
something like that ?
Yes, but I would lose the convenient literal notation.

…and which could have been seen by someone who looked at more than
the last two messages of a thread.

Heh heh. That’s OK. We were at “that point” in the fugue.

Hal

Hal F. wrote:

surf wrote:

Can’t you create your own using mixins and including enumerable or
something like that ?

Yes, but I would lose the convenient literal notation.

Hal

I was curious about this, so I had to try it:

================================================

class OrderedHash

include Enumerable

def initialize(*parm)
@arr = parm
@hash = {}
parm.each do |p|
p.each do |k,v|
@hash[k] = v
end
end
end

def []=(idx, val)
el = {idx => val}
@arr << el
@hash[idx] = val
end

def
@hash[idx]
end

def each
@arr.each do |e|
e.each do |k,v|
yield k,v
end
end
end

end

need each paramter in {}, could also work out a way to do

[:frog,“green”, :sky, “blue” etc ]

thingColor = OrderedHash.new({:frog => “green”},
{:sky => “blue”},
{:elephant => “pink”},
{:scarf => “red”}

                )

puts “the sky is #{thingColor[:sky]}\n\n”

thingColor[:limo] = “black”
thingColor[:snowman] = “white”

thingColor.each do |k,v|
puts “#{k} = #{v}”
end

On 8/26/06, surf [email protected] wrote:

I was curious about this, so I had to try it:

Note: There are about a dozen implementations of an insertion ordered
hash-like objects. I’ve written one. Hal has probably written one. Ara
Howard has written one, I think.

All that’s being discussed is whether it’s worthwhile making a C
version that has a new literal syntax to create insertion ordered
hash-like objects. The main concerns are (1) what literal syntax, and
(2) what name. Since we’re talking something new, we’re not talking
something that would affect the performance for people who don’t need
the functionality.

Just saying that we can implement this in Ruby – or proviing such an
implementation – isn’t helpful in the least. We know this. We’ve done
this. We’re looking for something different.

-austin

up
I also need ordered hash too,use like {} literal syntax

Austin Z. wrote:

(2) what name. Since we’re talking something new, we’re not talking
something that would affect the performance for people who don’t need
the functionality.

Just saying that we can implement this in Ruby – or proviing such an
implementation – isn’t helpful in the least. We know this. We’ve done
this. We’re looking for something different.

I started to get the idea to such an effect, or that some short hand
notation was desired although on my last post Hal just said it wouldn’t
work due to literal notation.