I want to read in a CSV file and produce output that lists the unique
values from each column in the following format:
Column: ColHeader1
UniqueVal1
UniqueVal2
Column: ColHeader2
UniqueVal1
UniqueVal2
…
What I’m currently getting is output that looks as follows:
Column: ColHeader1
ColHeader1UniqueVal1
ColHeader1UniqueVal2
Column: ColHeader2
ColHeader2UniqueVal1
ColHeader2UniqueVal2
…
For some reason, the column header is being prepended to each value, and
a blank row is printed at the start of each column. My code is below. Any
help is much appreciated. Essentially, I read the CSV into a hash where
each key is a column header and each value is an array of the values from
that column. I then run .uniq! on each array in the hash and print the
results to a file.
require 'rubygems'
require 'faster_csv'

infile = "xyz.csv"
uniques = {}

# Build a hash of column header => array of values from that column.
FCSV.open(infile, :headers => true).each do |row|
  row.each_with_index do |element,j|
    uniques[row.headers[j]] ||= []
    uniques[row.headers[j]] << element
  end
end

# Remove duplicate values from each column's array in place.
uniques.each do |key,element|
  element.uniq!
end

# Write each column header followed by its unique values.
File.open("unique_output.txt","w+") do |out|
  uniques.each_key do |key|
    out.write "Column: #{key}\n"
    uniques[key].each do |element|
      out.write " #{element}\n"
    end
  end
end
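
A likely cause of the prepended header, offered as a sketch rather than a confirmed diagnosis: iterating a CSV row yields [header, field] pairs, so in `row.each_with_index do |element,j|` the `element` is the whole pair, not just the field. On Ruby 1.8, interpolating an array into a string concatenates its elements, which would produce values like ColHeader1UniqueVal1. The example below assumes Ruby 1.9 or later, where FasterCSV became the standard `csv` library. The inline sample data and the helper names `unique_values_by_column` and `format_uniques` are hypothetical, introduced only for illustration.

```ruby
require 'csv'

# Hypothetical sample data standing in for xyz.csv.
SAMPLE = "ColHeader1,ColHeader2\na,x\nb,x\na,y\n"

# Collect the unique values of every column, keyed by column header.
def unique_values_by_column(csv_text)
  uniques = Hash.new { |hash, header| hash[header] = [] }
  CSV.parse(csv_text, headers: true).each do |row|
    # Each iteration yields a [header, field] pair, so destructure both
    # and store only the field.
    row.each do |header, field|
      uniques[header] << field
    end
  end
  uniques.each_value(&:uniq!)
  uniques
end

# Render the result in the "Column: name" layout described in the question.
def format_uniques(uniques)
  uniques.map { |header, values| (["Column: #{header}"] + values).join("\n") }.join("\n")
end

puts format_uniques(unique_values_by_column(SAMPLE))
```

With the sample data, this prints "Column: ColHeader1", then a and b, then "Column: ColHeader2", then x and y, each on its own line. Destructuring the pair in the block parameters also removes the need for the `row.headers[j]` lookup.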