How to define doubles: Array-to-hash

tharrisx · August 9, 2009, 5:54pm

Hi there,

Assume the following array (contains doubles):

original = [1, 2, 3, 3, 3, 3, 4, 4, 5]

What’s the most efficient way to obtain a hash that tells me what
elements of “original” have more than 1 appearance/entry (= doubles)?

The result should look like this:

result = {“3”=>4, “4”=>2}

Thanks for your help!
Tom-n00b

tharrisx · August 9, 2009, 6:11pm

Tom Ha wrote:

result = {“3”=>4, “4”=>2}

Thanks for your help!
Tom-n00b

I won’t argue that it’s the most “efficient” but the following both
works and is easy to understand.

irb(main):007:0> original.each do |o|
irb(main):008:1* o = o.to_s
irb(main):009:1> result[o] ||= 0
irb(main):010:1> result[o] += 1
irb(main):011:1> end
=> [1, 2, 3, 3, 3, 3, 4, 4, 5]
irb(main):012:0> result
=> {“1”=>1, “2”=>1, “3”=>4, “4”=>2, “5”=>1}
irb(main):013:0> result.delete_if {|k, v| v == 1}
=> {“3”=>4, “4”=>2}

tharrisx · August 9, 2009, 7:18pm

Hi,

Am Montag, 10. Aug 2009, 00:54:31 +0900 schrieb Tom Ha:

original = [1, 2, 3, 3, 3, 3, 4, 4, 5]

The result should look like this:

result = {“3”=>4, “4”=>2}

Here are two:

original.inject({}) { |h,e| h[e] ||= 0 ; h[e] += 1 ; h }

and

h = Hash.new { |h,k| h[k] = 0 }
original.each { |e| h[e] += 1 }
h

then

h.reject! { |k,v| k == 1 }

Bertram

tharrisx · August 9, 2009, 7:48pm

Great!

Thanks a lot, guys!

tharrisx · August 10, 2009, 1:17am

Bertram S. wrote:

Here are two:

original.inject({}) { |h,e| h[e] ||= 0 ; h[e] += 1 ; h }

and

h = Hash.new { |h,k| h[k] = 0 }
original.each { |e| h[e] += 1 }
h

then

h.reject! { |k,v| k == 1 }

Bertram

Your code doesn’t produce the desired result, the op wanted an efficient
solution, and this:

h = Hash.new { |h,k| h[k] = 0 }

is equivalent to:

h = Hash.new(0)

How about a regex?

original = [1, 2, 3, 3, 3, 3, 4, 4, 5]
results = {}

str = original.to_s

str.scan(/((\d)\2+)/).each do |match|
results[match[1]] = match[0].length
end

p results

–output:–
{“3”=>4, “4”=>2}

…but that is no faster than:

results = Hash.new(0)

original.each do |elmt|
results[elmt.to_s] += 1
end

results.delete_if {|key, val| val == 1}

p results

–output:–
{“3”=>4, “4”=>2}

tharrisx · August 10, 2009, 8:33am

Hi –

On Mon, 10 Aug 2009, Bertram S. wrote:

h = Hash.new(0)

Arrgh. Of course.

Actually they’re not equivalent.

h = Hash.new(0)
=> {}

h[1]
=> 0

h
=> {}

h = Hash.new {|h,k| h[k] = 0 }
=> {}

h[1]
=> 0

h
=> {1=>0}

David

tharrisx · August 10, 2009, 1:44am

Hi,

Am Montag, 10. Aug 2009, 08:17:42 +0900 schrieb 7stud --:

Bertram S. wrote:

h = Hash.new { |h,k| h[k] = 0 }

h = Hash.new { |h,k| h[k] = 0 }

is equivalent to:

h = Hash.new(0)

Arrgh. Of course.

Bertram

tharrisx · August 10, 2009, 10:27am

Hi,

Am Montag, 10. Aug 2009, 15:32:44 +0900 schrieb David A. Black:

h = Hash.new(0)
=> {}

h = Hash.new {|h,k| h[k] = 0 }
=> {}

h[1]
=> 0

h
=> {1=>0}

Arrgh. Of course.

I remind that my code called Hash#[] just in

h[e] += 1

and therefore it is even more efficient to use the default value
solution.

Bertram

tharrisx · August 11, 2009, 4:13pm

On Mon, Aug 10, 2009 at 1:17 AM, 7stud –[email protected] wrote:

Your code doesn’t produce the desired result, the op wanted an efficient
solution, and this:

Actually, he wanted the “most efficient”.

This is faster than the inject/hash method:

result = []
lastv = nil
count = 1
original.sort.each do |v|
  if v == lastv
    count += 1
  else
    result.push(lastv, count) if lastv
    count = 1
    lastv = v
  end
end
result.push(lastv, count)
hash = Hash[*result]

This is O(n logn+n) while the inject method should be theoretically be
O(n) (assuming that Hash tables lookups/insertions are constant-time
and inject is linear). However, its significantly faster (250%) on
1.8.6-mswin32 and a little faster (50%) on 1.9.1-mingw32, even on
large arrays (millions of elements). Anyone who can explain why?