Hi all
I’m using acts_as_ferret (a_a_f) in Rails with a StemmingAnalyzer, both
in the index and in my search. I got the idea from this topic:
http://www.ruby-forum.com/topic/80178
I’m having a problem with some search terms. I narrowed one of them
down to the inclusion of the word ‘fly’. Can anyone give me any clues
as to what might be happening, or even how I can investigate?
My index is set up like this:
acts_as_ferret({ :store_class_name => true,
                 :analyzer => Ferret::Analysis::StemmingAnalyzer.new,
                 :fields => { :name => { :boost => 2.0 },
                              …
                            } })
And this analyzer is defined in a module thus:
module Ferret::Analysis
  class StemmingAnalyzer
    def token_stream(field, text)
      StemFilter.new(StandardTokenizer.new(text))
    end
  end
end
Now, here’s a search without using the analyzer:
TeachingObject.find_with_ferret("flea fly", :per_page => 2000).size
=> 14
And with the analyzer:
TeachingObject.find_with_ferret("flea fly", :per_page => 2000, :analyzer => Ferret::Analysis::StemmingAnalyzer.new).size
=> 0
Now, for other searches, the analyzer seems to be doing its job nicely.
E.g. I have lots of resources with the word ‘brass’. With the analyzer,
a search for ‘brasses’ brings all these resources back, while without
the analyzer I don’t get any of them. That’s all fine: it’s working out
that ‘brasses’ and ‘brass’ are equivalent searches.
So what’s going on with the word ‘fly’? It’s definitely this word,
because if I rename one of the “flea fly” resources to “flea walk”,
then a search for ‘flea walk’ brings it back, as does a search for
‘flea walks’.
I’m guessing that the analyzer takes a word and converts it into other
terms, or some symbols or something, and searches with that combined
set, and that during this process the original word ‘fly’ gets lost
somewhere. But I don’t know where to look to monitor this process.
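One way this could perhaps be monitored is to print the tokens the analyzer produces for a given piece of text. The following is only a minimal sketch. It assumes Ferret's analyzer protocol, where token_stream(field, text) returns a stream whose next method hands back tokens with a text attribute and then nil. The helper name analyzed_terms and the StandIn classes are hypothetical; they are there only so the example runs without Ferret installed.

```ruby
# Return the text of every token an analyzer emits for the given text.
# Assumes Ferret's analyzer protocol: token_stream(field, text) returns a
# stream whose #next yields a token responding to #text, then nil at the end.
def analyzed_terms(analyzer, field, text)
  stream = analyzer.token_stream(field, text)
  terms = []
  while (token = stream.next)
    terms << token.text
  end
  terms
end

# Hypothetical stand-ins (not part of Ferret) so the sketch runs on its own.
StandInToken = Struct.new(:text)

class StandInStream
  def initialize(words)
    @words = words.dup
  end

  # Hand back one token per call, nil once the words run out.
  def next
    word = @words.shift
    word && StandInToken.new(word)
  end
end

class StandInAnalyzer
  # Splits on whitespace only; no stemming is done here.
  def token_stream(field, text)
    StandInStream.new(text.split)
  end
end

p analyzed_terms(StandInAnalyzer.new, :name, "flea fly")

# With Ferret loaded, the same helper could be pointed at the real analyzer:
#   p analyzed_terms(Ferret::Analysis::StemmingAnalyzer.new, :name, "flea fly")
```

If the real analyzer prints something other than ‘fly’ for that word, the next thing to compare would be the terms actually stored in the index for the same text.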
Any help/advice/clues very welcome…
thanks
max