On Sat, 2 Dec 2006, Paul L. wrote:
cat list | minmax
Disagree.
It’s a bit too late to disagree, in the face of the evidence that I said it,
then I did it.
i agree that it’s easy to emulate awk, but shouldn’t we do something better
in ruby? i’m personally always inspired by ruby’s elegance to write
something better and more extensible than what i could easily do in the
shell/awk/perl/c/etc, and i’ve found that, over the long run (say more than
3 days), my productivity increases exponentially if i simply embrace ruby’s
power to write clear and re-usable code and code it right ‘the first time.’
imho it’s a shame to write throw-away scripts in ruby.
here’s what i’ve got so far: the concept is that each line may contain ‘n’
columns of numbers, which is to say the input is not a simple list of
numbers, but a list of rows of numbers: a table. any non-numeric data is
ignored, eliminating the need to grep out crud. also, integer arithmetic is
attempted where possible but the code falls back to floats when needed. all
numeric input must be valid - no use of #to_i or #to_f, preferring Integer()
and Float(). the code abstracts all of the input, computation, and output
functions and is user-extensible via the use of duck-typed filters. it’s
also usable both as a library and from the command-line.
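to make the Integer()/Float() point concrete, here is a minimal standalone sketch of the strict-parsing idea. the helper name parse_number and the sample fields are made up for illustration and are not part of listc:

```ruby
# strict conversion: Integer() and Float() raise ArgumentError on junk,
# while #to_i and #to_f quietly return 0 or whatever numeric prefix they find.
# parse_number is a hypothetical helper, not something listc defines.
def parse_number(field)
  Integer(field)
rescue ArgumentError
  begin
    Float(field)
  rescue ArgumentError
    nil   # not a number at all: caller drops it
  end
end

fields  = %w( foo 1 2.5 12abc - )
numbers = fields.map { |f| parse_number(f) }.compact

p numbers                      # [1, 2.5]
p fields.map { |f| f.to_i }    # [0, 1, 2, 12, 0]
```

the second line of output shows why #to_i is a poor fit here: ‘foo’ and ‘-’ silently become 0, and ‘12abc’ becomes 12, so crud in the input would skew every statistic.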
first some examples of usage:
mussel:~/eg/ruby/listc > cat input.a
1
2
3
mussel:~/eg/ruby/listc > ./listc sum < input.a
6
mussel:~/eg/ruby/listc > ./listc mean < input.a
2.0
mussel:~/eg/ruby/listc > cat input.b
1 2
3 4
5 6
mussel:~/eg/ruby/listc > ./listc median < input.b
3.0 4.0
mussel:~/eg/ruby/listc > cat input.c
foo 1 bar 2
a 3 b 4
x 5 y 6
mussel:~/eg/ruby/listc > ./listc minmax < input.c
1:5 2:6
mussel:~/eg/ruby/listc > ./listc min < input.c
1 2
mussel:~/eg/ruby/listc > ./listc max < input.c
5 6
mussel:~/eg/ruby/listc > cat input.d
- elapsed : 770.1453289
- elapsed : 620.9993257
- elapsed : 1440.629573
mussel:~/eg/ruby/listc > ./listc mean < input.d
943.924742533333
now the code (i’m not golfing; for you non-vim users, the strange markers
are ‘folds’: those lines appear as one single line to me):
mussel:~/eg/ruby/listc > cat ./listc
#! /usr/bin/env ruby

class Main
#--{{{
  OPS = %w( sum add mean avg median max min minmax )

  def main
    op = ARGV.shift.to_s.strip.downcase
    klass =
      case op
        when 'sum', 'add'
          SumFilter
        when 'mean', 'avg'
          MeanFilter
        when 'median'
          MedianFilter
        when 'minmax'
          MinMaxFilter
        when 'max'
          MaxFilter
        when 'min'
          MinFilter
        else
          abort "bad op <#{ op }> not in <#{ OPS.join ',' }>"
      end
    filter = klass.new
    $stdin.each{|line| filter << line}
    filter.result >> $stdout
  end
#--}}}
end

def Main(*a, &b) Main.new(*a, &b).main end
module FilterUtils
#--{{{
  # keep only the fields that parse strictly as integers or floats
  def extract_numbers line
    fields = line.strip.split(%r/\s+/)
    fields.map{|f| Integer(f) rescue Float(f) rescue nil}.compact
  end

  class List < Array
    def >> port = STDOUT
      port << join(' ')
      port << "\n"
    end
    def self.from other
      new.instance_eval{ replace other; self }
    end
  end

  def new_list l = nil
    l ? (List === l ? l : List.from(l)) : List.new
  end

  class MultiList < Array
    def >> port = STDOUT
      port << map{|elem| elem.join(':')}.join(' ')
      port << "\n"
    end
    def self.from other
      new.instance_eval{ replace other; self }
    end
  end

  def new_multilist ml = nil
    ml ? (MultiList === ml ? ml : MultiList.from(ml)) : MultiList.new
  end
#--}}}
end
class SumFilter
#--{{{
  include FilterUtils

  attr 'sum'

  def initialize
    @sum = new_list
  end

  def << line
    numbers = extract_numbers line
    numbers.each_with_index do |n,i|
      @sum[i] ||= 0
      @sum[i] += n
    end
  end

  def result
    @sum
  end
#--}}}
end
class MeanFilter
#--{{{
  include FilterUtils

  attr 'sum'
  attr 'count'

  def initialize
    @sum = new_list
    @count = new_list
  end

  def << line
    numbers = extract_numbers line
    numbers.each_with_index do |n,i|
      @sum[i] ||= 0
      @count[i] ||= 0
      @sum[i] += n
      @count[i] += 1
    end
  end

  def result
    mean = new_list
    @sum.zip(@count){|s,c| mean << (s.to_f/c.to_f)}
    mean
  end
#--}}}
end
class MedianFilter
#--{{{
  include FilterUtils

  attr 'min'
  attr 'max'

  def initialize
    @min = new_list
    @max = new_list
  end

  def << line
    numbers = extract_numbers line
    numbers.each_with_index do |n,i|
      @min[i] ||= n
      @min[i] = [ @min[i], n ].min
      @max[i] ||= n
      @max[i] = [ @max[i], n ].max
    end
  end

  # note: this reports the midpoint between each column's min and max
  def result
    median = new_list
    @min.zip(@max){|mi,ma| median << (mi + ((ma - mi)/2.0))}
    median
  end
#--}}}
end
class MinMaxFilter
#--{{{
  include FilterUtils

  # the only state kept here is @minmax: one [min, max] pair per column
  attr 'minmax'

  def initialize
    @minmax = new_multilist
  end

  def << line
    numbers = extract_numbers line
    numbers.each_with_index do |n,i|
      @minmax[i] ||= [n,n]
      @minmax[i][0] = [ @minmax[i][0], n ].min
      @minmax[i][1] = [ @minmax[i][1], n ].max
    end
  end

  def result
    @minmax
  end
#--}}}
end
class MinFilter < MinMaxFilter
#--{{{
  def result
    new_list @minmax.map{|minmax| [minmax.first]}
  end
#--}}}
end

class MaxFilter < MinMaxFilter
#--{{{
  def result
    new_list @minmax.map{|minmax| [minmax.last]}
  end
#--}}}
end

Main() if __FILE__ == $0
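
as a sketch of what ‘duck-typed filters’ means in practice: a filter only needs #<< (taking a line) and #result (returning something that can write itself with #>>). the CountFilter and Row names below are hypothetical, and Row is a small stand-in for FilterUtils::List so the example runs on its own:

```ruby
require 'stringio'

# hypothetical filter: counts the numeric fields seen in each column.
# it quacks like the filters in listc: #<< takes a line, #result returns
# an object that responds to #>>.
class CountFilter
  # stand-in for FilterUtils::List so this sketch is self-contained
  class Row < Array
    def >>(port = $stdout)
      port << join(' ') << "\n"
    end
  end

  def initialize
    @count = Row.new
  end

  def <<(line)
    numbers = line.split.map { |f| Integer(f) rescue Float(f) rescue nil }.compact
    numbers.each_index { |i| @count[i] = (@count[i] || 0) + 1 }
    self
  end

  def result
    @count
  end
end

filter = CountFilter.new
"foo 1 bar 2\na 3 b\nx 5 y 6\n".each_line { |line| filter << line }

out = StringIO.new
filter.result >> out
print out.string   # 3 2
```

hooking something like this into the command-line would presumably also need a new branch in the case statement in Main#main; that part is not shown here.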
of course this code isn’t perfect, but if i’m going to spend time adding a
list of numbers i’m going to put in at least this much effort.
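
one known rough edge: MedianFilter reports the midpoint between each column’s min and max, which matches the median for input.b but not for skewed data. a minimal sketch of a sorted-middle median follows, assuming it is acceptable to hold every value in memory; the SortedMedianFilter name is hypothetical and the class is standalone rather than built on FilterUtils:

```ruby
# hypothetical alternative: keep every value per column and return the
# statistical median (the middle of the sorted values, or the mean of the
# two middle values when the count is even).
class SortedMedianFilter
  def initialize
    @columns = []
  end

  def <<(line)
    numbers = line.split.map { |f| Integer(f) rescue Float(f) rescue nil }.compact
    numbers.each_with_index { |n, i| (@columns[i] ||= []) << n }
    self
  end

  def result
    @columns.map do |values|
      sorted = values.sort
      mid = sorted.size / 2
      sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
    end
  end
end

filter = SortedMedianFilter.new
"1 10\n2 20\n100 30\n4 40\n".each_line { |line| filter << line }
p filter.result   # [3.0, 25.0]
```

for the first column the min/max midpoint would be 50.5, while the sorted-middle median is 3.0; the trade-off is memory, since the min/max approach only keeps two numbers per column.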
kind regards.
-a