On Saturday 02 December 2006 21:07, Joel VanderWerf wrote:
- the median
know (I am not much of a statistician) require you to know the mean in
advance.
Do you do it in two passes through the data, first getting the mean and
then the stdev? (But this would not work if you are reading data from
stdin and don’t want to cache the data in memory.)
Yes, to compute the standard deviation I need the mean. In fact, there
are dependencies between the quantities to compute:
- stddev needs the variance
- variance needs the mean, the number of values, and the sum of the
  squared values
- mean needs the sum of the values and the number of values.
Here is the hash I use for this (the number of values is always computed):
DEPENDANCIES = {
  :histogram => [:table],
  :mean      => [:sum],
  :variance  => [:sum, :square],
  :deviation => [:sum, :square],
  :median    => [:table],
  :skewness  => [:sum, :square, :cube],
  :kurtosis  => [:sum, :square, :cube, :quad],
} # :nodoc:
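As a rough sketch of how such a hash can be used, the helper below collects the accumulators needed for a given set of statistics. The name required_accumulators is hypothetical (it is not from my library); only the hash itself is taken from above.

```ruby
# The dependency hash from the post, repeated so the sketch runs on its own.
DEPENDANCIES = {
  :histogram => [:table],
  :mean      => [:sum],
  :variance  => [:sum, :square],
  :deviation => [:sum, :square],
  :median    => [:table],
  :skewness  => [:sum, :square, :cube],
  :kurtosis  => [:sum, :square, :cube, :quad],
}

# Hypothetical helper: list each accumulator once, in first-seen order.
# An unknown statistic name raises KeyError via Hash#fetch.
def required_accumulators(stats)
  stats.flat_map { |stat| DEPENDANCIES.fetch(stat) }.uniq
end

needed = required_accumulators([:mean, :deviation])
# needed is [:sum, :square]: the mean and the deviation share the plain sum.
```

Asking for :median or :histogram pulls in :table, which is the one case where the values themselves have to be kept.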
Then the method that adds a value is generated so that it updates exactly
these sums each time a value is added.
In this case, the generated method would be:
def add_pixel(value)
  @nb_px += 1
  @lum_sum += value
  @lum_square_sum += value**2
  return self
end
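For the curious, one possible way to generate such a method is sketched below. This is not the actual code from my library: the class name GeneratedStats, the UPDATES table and the use of instance_eval are all assumptions made for the illustration, and the :table accumulator is left out.

```ruby
# Hypothetical table: the update statement each accumulator contributes.
UPDATES = {
  :sum    => '@lum_sum += value',
  :square => '@lum_square_sum += value**2',
  :cube   => '@lum_cube_sum += value**3',
  :quad   => '@lum_quad_sum += value**4',
}

class GeneratedStats
  attr_reader :nb_px, :lum_sum, :lum_square_sum, :lum_cube_sum, :lum_quad_sum

  def initialize(accumulators)
    @nb_px = 0
    @lum_sum = @lum_square_sum = @lum_cube_sum = @lum_quad_sum = 0
    body = accumulators.map { |acc| UPDATES.fetch(acc) }.join("\n")
    # Define add_pixel on this object only, with just the requested updates.
    instance_eval <<-RUBY
      def add_pixel(value)
        @nb_px += 1
        #{body}
        return self
      end
    RUBY
  end
end

stats = GeneratedStats.new([:sum, :square])
stats.add_pixel(2).add_pixel(4)
# The count, sum and sum of squares are updated; the cube sum is never touched.
```

The point of generating the method is that the inner loop only pays for the sums that the requested statistics really need.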
and the mean, variance and deviation methods are available at any time:
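def mean
  # Guard on the count so an empty data set returns 0 instead of NaN.
  return 0 if @nb_px.zero?
  return @lum_sum / @nb_px.to_f
end

def variance
  # A zero sum does not imply zero variance when values can be negative,
  # so the guard is on the count here as well.
  return 0 if @nb_px.zero?
  return @lum_square_sum / @nb_px.to_f - self.mean**2
end

def deviation
  return Math.sqrt(variance)
end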
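Putting the pieces together, here is a small self-contained sketch with a worked example. The class name LuminanceStats and the initialize method are assumptions added so that it runs on its own, and the empty-data guard tests the count, which is a choice made for this sketch.

```ruby
class LuminanceStats
  def initialize
    @nb_px = 0
    @lum_sum = 0
    @lum_square_sum = 0
  end

  # Single pass: only the count and two running sums are kept in memory.
  def add_pixel(value)
    @nb_px += 1
    @lum_sum += value
    @lum_square_sum += value**2
    self
  end

  def mean
    return 0 if @nb_px.zero?
    @lum_sum / @nb_px.to_f
  end

  # Population variance: mean of the squares minus the square of the mean.
  def variance
    return 0 if @nb_px.zero?
    @lum_square_sum / @nb_px.to_f - mean**2
  end

  def deviation
    Math.sqrt(variance)
  end
end

stats = LuminanceStats.new
[2, 4, 4, 4, 5, 5, 7, 9].each { |v| stats.add_pixel(v) }
stats.mean       # 40 / 8.0        => 5.0
stats.variance   # 232 / 8.0 - 25  => 4.0
stats.deviation  # sqrt(4.0)       => 2.0
```

So the values can come from stdin one at a time and nothing has to be cached.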
For each value v, we can compute any of these statistics by accumulating
the sums of v, v**2, v**3 and v**4 (kurtosis needs all four, for example).
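To make this concrete, here is a minimal sketch that derives the skewness and the kurtosis from the four sums. The helper name moments_from_sums is hypothetical, and the definitions used (population moments, non-excess kurtosis) are assumptions; other conventions differ by a constant or a bias correction.

```ruby
# Hypothetical helper: n is the number of values, s1..s4 are the sums of
# v, v**2, v**3 and v**4. Assumes the variance is non-zero.
def moments_from_sums(n, s1, s2, s3, s4)
  n = n.to_f
  m1, m2, m3, m4 = s1 / n, s2 / n, s3 / n, s4 / n   # raw moments
  # Central moments, expanded in terms of the raw moments.
  mu2 = m2 - m1**2
  mu3 = m3 - 3 * m1 * m2 + 2 * m1**3
  mu4 = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4
  { :mean     => m1,
    :variance => mu2,
    :skewness => mu3 / mu2**1.5,
    :kurtosis => mu4 / mu2**2 }
end

values = [2, 4, 4, 4, 5, 5, 7, 9]
# The four power sums; in real use they would be accumulated one value at a time.
sums = (1..4).map { |p| values.inject(0) { |acc, v| acc + v**p } }
result = moments_from_sums(values.size, *sums)
# For these values: skewness is about 0.656 and kurtosis about 2.781.
```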
Et voila