The frequency count returned by my Ferret reader doesn’t decrement
after I remove documents containing those terms. Using the example from http://ferret.davebalmain.com/api/classes/Ferret/Index/TermEnum.html
the frequency increments after a document is added but stays the same
after a document is deleted.
index.reader.terms(:tags).each do |term, freq|
  puts "#{term} appears #{freq} times"
end
If I iterate through each document matched by terms_for I get the
correct frequency but I assume at a higher performance cost.
index.reader.terms(:tags).each do |term|
  # Walk every document matched for this term to get the count
  freq = index.reader.terms_for(:tags, term).each{}
  puts "#{term} appears #{freq} times"
end
I’m wondering if I’m just plain doing something wrong. I’m running the
gem version 0.11.6 (ruby) on i686-darwin9.1.0, and I can provide a unit
test if it’d help.
Indeed looks like a bug. I’ve gone through a small hell recently
because of a similar issue =)
index.size also suffers from the same problem. Apparently values for
num_docs (or you tell me what it is exactly if I’m getting it wrong)
get cached in IndexReader and when you call it, it returns values that
are not necessarily consistent with what’s actually in the index.
In this same situation, calling index.optimize before index.size solves
the problem.
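
One possible explanation, offered as an assumption and not something confirmed from Ferret's source: Lucene-style indexes usually just mark a document as deleted and leave the stored per-term counts alone until the index is rewritten. The toy model below is plain Ruby with a hypothetical ToyIndex class (none of it is Ferret's API). It only sketches why a stored count could stay stale after a delete, why walking the matching documents gives the right number, and why an optimize step would bring the two back in line.

```ruby
# Toy model only: this is NOT Ferret's code. ToyIndex and its methods are
# hypothetical names used to illustrate "mark as deleted, fix counts later".
class ToyIndex
  def initialize
    @docs = []               # doc_id => array of terms
    @deleted = {}            # doc_id => true once marked as deleted
    @doc_freq = Hash.new(0)  # term => stored document count
  end

  # Adds a document and bumps the stored count for each of its terms.
  def add(terms)
    unique = terms.uniq
    @docs << unique
    unique.each { |t| @doc_freq[t] += 1 }
    @docs.size - 1
  end

  # Deleting only flags the document; stored counts are left untouched.
  def delete(doc_id)
    @deleted[doc_id] = true
  end

  # Cheap lookup of the stored count (may still include deleted documents).
  def doc_freq(term)
    @doc_freq[term]
  end

  # Slower: walk every document and skip the ones flagged as deleted.
  def live_doc_freq(term)
    @docs.each_with_index.count { |terms, id| !@deleted[id] && terms.include?(term) }
  end

  # Rebuild the index from the live documents, recomputing the counts.
  def optimize
    live = @docs.each_with_index.reject { |_, id| @deleted[id] }.map(&:first)
    @docs = []
    @deleted = {}
    @doc_freq = Hash.new(0)
    live.each { |terms| add(terms) }
  end
end

index = ToyIndex.new
index.add(%w[ruby ferret])
doomed = index.add(%w[ruby])
index.delete(doomed)

stale = index.doc_freq("ruby")      # 2: stored count still includes the deleted doc
live  = index.live_doc_freq("ruby") # 1: counted by walking live documents
index.optimize
fresh = index.doc_freq("ruby")      # 1: stored count after the rebuild
```

If Ferret behaves like this model, the trade-off is the one described above: the stored count is cheap but can lag behind deletes, while iterating the matches is accurate but costs more per term.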