Hi all,
I am just wondering whether there is a way to group similar (not identical)
elements in an array.
For example, changing old_array into new_array as follows:
old_array = ["John", "Mike1", "Bob1", "Mike2", "Bob2"]
new_array = ["John", ["Mike1", "Mike2"], ["Bob1", "Bob2"]]
Thanks,
I don’t have a ready-made solution, but this sounds as if it could
be solved via the Levenshtein distance.
Example:
require 'rubygems/text'
include Gem::Text
levenshtein_distance 'shevy', 'chevy' # => 1
I think with that, you can build your criteria.
In your above example, “Mike1” and “Mike2”
would have a distance of 1; so perhaps you can
use this as a criterion.
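To make the distance idea concrete, here is a minimal sketch, not a definitive solution. It assumes that "similar" means "within one edit of the first member of a group"; the names NameDistance and group_by_distance are made up for illustration, and only levenshtein_distance comes from Gem::Text.

```ruby
require 'rubygems/text'

# Hypothetical helper module: exposes Gem::Text#levenshtein_distance
# without including Gem::Text into every object.
module NameDistance
  extend Gem::Text
end

# Hypothetical helper: greedy grouping. Each name joins the first group
# whose first member is within max_distance edits, else starts a new group.
def group_by_distance(names, max_distance = 1)
  groups = []
  names.each do |name|
    home = groups.find do |group|
      NameDistance.levenshtein_distance(group.first, name) <= max_distance
    end
    if home
      home << name
    else
      groups << [name]
    end
  end
  # Unwrap one-member groups so that "John" stays a plain string.
  groups.map { |group| group.size == 1 ? group.first : group }
end

old_array = %w[John Mike1 Bob1 Mike2 Bob2]
p group_by_distance(old_array)
# => ["John", ["Mike1", "Mike2"], ["Bob1", "Bob2"]]
```

Note that the result depends on the order of the input and on the threshold, so short names that differ by one letter (say "Bob" and "Rob") would also end up together.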
I don’t know which one would be ideal, but
try .group_by, perhaps together with .select.
Perhaps you may have to create intermediary arrays
via different methods and then merge them back
in again, to have an expanded Array.
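As a rough illustration of the group_by route, the sketch below assumes that two names count as similar when they are equal once trailing digits are stripped. That criterion is my assumption, not something stated in the question, so swap in whatever rule fits your data.

```ruby
old_array = %w[John Mike1 Bob1 Mike2 Bob2]

# Assumed criterion: the grouping key is the name without trailing digits.
groups = old_array.group_by { |name| name.sub(/\d+\z/, '') }.values

# Unwrap one-member groups so that "John" stays a plain string.
new_array = groups.map { |group| group.size == 1 ? group.first : group }

p new_array
# => ["John", ["Mike1", "Mike2"], ["Bob1", "Bob2"]]
```

group_by returns a Hash that keeps keys in first-seen order, which is why the groups come out in the same order as in the example from the question.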
#Actually, I want to group all elements containing 'Mike' into one group.
#I wrote the following script for my own purpose as a proof of principle.
#Since I cannot find an easy way to group similar elements in one array,
#I use two arrays to achieve the purpose.
#Any comments?
#Thanks,
require 'pp'

def read_files
  array1 = ['1_s', '2_a', '3_', '4_', '5_']
  array2 = ['11', '22', '33', '111', '11111']
  return array1, array2
end

def group_element(array1, array2)
  array3 = []
  array1.each do |e1|
    # pattern to match: the part of e1 before the first underscore
    m = e1.split('_')[0]
    temp = []
    array2.each do |e2|
      temp << [e1, e2] if e2.match(m)
    end
    temp = temp.flatten.uniq
    if temp.empty?
      # single element: nothing in array2 matched
      array3 << e1
    else
      array3 << temp
    end
  end # array1.each
  pp array3
end # def
###########main################
array1, array2 = read_files
group_element(array1, array2)
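Since comments were asked for: the same two-array idea can probably be written more compactly with map and select. This is only a sketch; group_by_prefix is a hypothetical name, and it uses a plain substring test (include?) where the script above uses a regexp match, which gives the same result for this sample data.

```ruby
# Hypothetical helper: for each key, collect the candidates that contain
# the part of the key before the first underscore.
def group_by_prefix(keys, candidates)
  keys.map do |key|
    prefix = key.split('_').first
    matches = candidates.select { |candidate| candidate.include?(prefix) }
    # Keys without any match stay as plain strings.
    matches.empty? ? key : [key, *matches].uniq
  end
end

array1 = ['1_s', '2_a', '3_', '4_', '5_']
array2 = ['11', '22', '33', '111', '11111']
p group_by_prefix(array1, array2)
# => [["1_s", "11", "111", "11111"], ["2_a", "22"], ["3_", "33"], "4_", "5_"]
```

The sketch assumes every key has a non-empty part before the underscore; a key such as "_x" would have an empty prefix and match every candidate.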
#After reading about the 'group_by' method in Ruby and googling how it is done in Python,
#I came up with my own way of grouping similar elements in one array, as follows:
def groupby_similar_elements # with some criteria
  array1 = ['1_s', '2_a', '3_', '4_', '5_']
  array2 = ['11', '22', '33', '111', '11111']
  array3 = array1 + array2
  # criterion: group by the first character
  array3 = array3.group_by { |e| e[0] }.values
  pp array3
end
#The point is to add a user-specific condition.
#output
ruby group2.rb
[["1_s", "11", "111", "11111"], ["2_a", "22"], ["3_", "33"], ["4_"], ["5_"]]
Exit code: 0
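One way to make the condition user-specific is to pass it in as a block, so the same method can group by any criterion. The sketch below is only an illustration; group_similar is a hypothetical name, and it assumes a block is always given.

```ruby
# Hypothetical helper: the caller's block decides which elements belong together.
def group_similar(elements, &criterion)
  elements.group_by(&criterion).values
end

combined = ['1_s', '2_a', '3_', '4_', '5_'] + ['11', '22', '33', '111', '11111']

# Same criterion as above: the first character.
p group_similar(combined) { |e| e[0] }
# => [["1_s", "11", "111", "11111"], ["2_a", "22"], ["3_", "33"], ["4_"], ["5_"]]

# A different criterion: the string length.
p group_similar(combined) { |e| e.size }
# => [["1_s", "2_a", "111"], ["3_", "4_", "5_", "11", "22", "33"], ["11111"]]
```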