Stem Analyzer

nappin · December 6, 2006, 9:22pm

Hi all,

I am trying to implement a search that will use the Stem Analyzer. I
added the Stem Analyzer from the example shown in another post:

module Ferret::Analysis
  class StemmingAnalyzer
    def token_stream(field, text)
      StemFilter.new(StandardTokenizer.new(text))
    end
  end
end

The problem with the Stem Analyzer is that when I search for a term such
as ‘engineering’, it only matches whole words equal to the stem, so the
only results I get back are documents where ‘engin’ is a whole word
(i.e. I don’t get back documents with ‘engineering’, ‘engineer’, or
‘engin*’). Am I using the Stem Analyzer correctly? Is there a better
way to get the desired behavior? Any help would be much appreciated!

Cheers!

nappin · December 6, 2006, 10:10pm

You also need to stem-analyze the incoming query.

I had this same problem. :^>

Schnitz

nappin · December 6, 2006, 10:21pm

Matt S. wrote:

You also need to stem-analyze the incoming query.

I had this same problem. :^>

Schnitz

Do you have an example of how to do this? I’m using AAF.
Thanks,
Ray

nappin · December 7, 2006, 1:28am

Raymond O’connor wrote:

Matt S. wrote:

You also need to stem-analyze the incoming query.

I had this same problem. :^>

Schnitz

Do you have an example of how to do this? I’m using AAF.
Thanks,
Ray

I’m also using acts_as_ferret, and am stuck on exactly the same problem.

Thanks in advance to anyone who can help on this.

Will

nappin · December 7, 2006, 1:53am

On 07.12.2006, at 01:28, William Mcguinty wrote:

Thanks,
Ray

I’m also using acts_as_ferret, and am stuck on exactly the same
problem.

How about passing your custom analyzer to acts_as_ferret as described
in the docs?

acts_as_ferret {:fields => ["id", "name", "body"]}, {:analyzer =>
MyFunkyStemAnalyzer}

Cheers,
Andy

nappin · December 7, 2006, 6:27am

Well, you bring up another question I had. I was using a line similar
to yours above to set the stem analyzer and I would always get a parse
error. I even get a parse error when I paste your line in. I’m pretty
new to Ruby, and I’m sure it’s something obvious, but I can’t get rid of
the parse error without removing the curly braces around the hashes, such as
acts_as_ferret :fields => ["id", "name", "body"], :analyzer =>
MyFunkyStemAnalyzer
When I do that, AAF never seems to call my analyzer. So I ended up
editing the AAF code and setting the analyzer option inside there, and it
worked, except that I get the whole problem I stated above.

Thanks for your help,
Ray

nappin · December 7, 2006, 10:58am

On 07.12.2006, at 06:27, Raymond O’connor wrote:

editing the AAF code and setting the analyzer option inside there
and it
worked except I get the whole problem I stated above.

This is the method signature as of the latest AAF rdoc:

acts_as_ferret(options={}, ferret_options={})

It expects two hashes, defaulting to empty hashes if no arguments are
supplied. In Ruby you can omit the parentheses for a method call and
the curly braces for a hash argument if it is the last argument.

A method with the signature

my_method(options = {})

may be used in the following ways:

my_method
my_method({:key => "value"})
my_method(:key => "value")
my_method {:key => "value"}
my_method :key => "value"

So the following call is supposed to work

acts_as_ferret {:fields => ["id", "name", "body"]}, {:analyzer =>
MyFunkyStemAnalyzer}

You could try parentheses

acts_as_ferret({:fields => ["id", "name", "body"]}, {:analyzer =>
MyFunkyStemAnalyzer})

which should work in any case since it is the most explicit form.

If you still get a parse error, you might want to post your actual
code and the error message you get.

HTH
Andy

nappin · December 7, 2006, 11:16am

On Thu, Dec 07, 2006 at 06:27:13AM +0100, Raymond O’connor wrote:

Well, you bring up another question I had. I was using a line similar
to yours above to set the stem analyzer and I would always get a parse
error. I even get a parse error when I paste your line in. I’m pretty
new to Ruby, and I’m sure it’s something obvious, but I can’t get rid of
the parse error without removing the curly braces around the hashes, such as
acts_as_ferret :fields => ["id", "name", "body"], :analyzer =>
MyFunkyStemAnalyzer
When I do that, AAF never seems to call my analyzer. So I ended up
editing the AAF code and setting the analyzer option inside there, and it
worked, except that I get the whole problem I stated above.

The correct syntax would be:

acts_as_ferret( { :fields => [:id, :name, :body] },
:analyzer => MyFunkyStemAnalyzer.new)

Curly brackets are needed for the first options hash only, but may be
placed around the second one (where the analyzer option belongs), too.
The important thing to get past the parse error is to use () around the
whole argument list.
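
For anyone following along, here is a minimal plain-Ruby sketch of the two effects discussed above. It does not touch acts_as_ferret at all: two_hashes is a hypothetical stand-in that only shares the two-hash signature, parses? is a made-up helper, and :my_analyzer is a placeholder symbol. It suggests that without braces both keys end up in the first hash (which would fit the analyzer apparently being ignored), and that a brace directly after the method name is read as a block, hence the parse error.

```ruby
# Hypothetical stand-in with the same two-hash signature as acts_as_ferret.
def two_hashes(options = {}, ferret_options = {})
  [options, ferret_options]
end

# Hypothetical helper: true if Ruby can parse and run the given source line.
def parses?(source)
  eval(source)
  true
rescue SyntaxError
  false
end

# Explicit parentheses: each hash lands in its own argument.
explicit = two_hashes({ :fields => [:id, :name] }, :analyzer => :my_analyzer)

# No braces at all: both keys merge into the first hash, so the second
# argument stays empty and never receives the :analyzer key.
merged = two_hashes(:fields => [:id, :name], :analyzer => :my_analyzer)

puts "explicit second hash keys: #{explicit[1].keys.inspect}"
puts "merged second hash keys:   #{merged[1].keys.inspect}"

# A brace right after the method name starts a block, not a hash literal.
puts parses?('two_hashes {:fields => [:id]}, {:analyzer => :my_analyzer}')
puts parses?('two_hashes({:fields => [:id]}, {:analyzer => :my_analyzer})')
```

The parenthesized form is the only one of the three that both parses and keeps the analyzer in the second hash.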

Sorry, I think I forgot these subtle things when posting code in the past
myself.

Jens

–
webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer [email protected]
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66

nappin · December 7, 2006, 11:39am

Hi Jens,
Thanks for the help! The parentheses did the trick. Combined with the
other post, I think I have all my issues ironed out… for now at least
haha

One other question: If I modify/add an analyzer, does the index have to
be rebuilt? I thought I noticed some searches not working with the stem
analyzer until I rebuilt it, but then again it’s late for me and I may be
just seeing things

-Ray


nappin · December 7, 2006, 1:44pm

On Thu, Dec 07, 2006 at 11:39:50AM +0100, Raymond O’connor wrote:

Hi Jens,
Thanks for the help! The parentheses did the trick. Combined with the
other post, I think I have all my issues ironed out… for now at least
haha

One other question: If I modify/add an analyzer, does the index have to
be rebuilt? I thought I noticed some searches not working with the stem
analyzer until I rebuilt it, but then again it’s late for me and I may be
just seeing things

No, that’s correct. Analysis is done when a document is added to the
index. In addition to that, each query is run through the analyzer.

So you definitely need to rebuild the index after changing the
analyzer.
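
To illustrate why the rebuild matters, here is a toy sketch in plain Ruby. It does not use Ferret: toy_stem is a crude, hypothetical stand-in for a real stemmer (it is not the Porter algorithm), and PLAIN, STEMMING, DOCS, build_index and search are made-up names for this example only. The point it tries to show is that the query and the indexed terms only meet when both went through the same analysis.

```ruby
require 'set'

# Crude stand-in for a real stemmer; only good enough for this example.
def toy_stem(word)
  word.sub(/(eering|eer|ing|s)\z/, '')
end

# Two hypothetical "analyzers": text in, array of terms out.
PLAIN    = lambda { |text| text.downcase.scan(/[a-z]+/) }
STEMMING = lambda { |text| PLAIN.call(text).map { |w| toy_stem(w) } }

DOCS = {
  1 => 'Software engineering jobs',
  2 => 'Ask an engineer',
  3 => 'Cooking tips'
}

# Analysis happens here, at indexing time: only the analyzed terms are stored.
def build_index(docs, analyzer)
  index = Hash.new { |hash, term| hash[term] = Set.new }
  docs.each do |id, text|
    analyzer.call(text).each { |term| index[term] << id }
  end
  index
end

# The query goes through an analyzer too; documents must contain every term.
def search(index, query, analyzer)
  hits = analyzer.call(query).map { |term| index.fetch(term, Set.new) }
  hits.reduce(:&).to_a.sort
end

# Index built before the analyzer changed: it holds unstemmed terms,
# so the stemmed query term cannot match anything in it.
stale = build_index(DOCS, PLAIN)
puts search(stale, 'engineering', STEMMING).inspect

# Index rebuilt with the same analyzer the query uses.
fresh = build_index(DOCS, STEMMING)
puts search(fresh, 'engineering', STEMMING).inspect
```

The stale index returns nothing for the stemmed query, while the rebuilt one finds both the ‘engineering’ and the ‘engineer’ document.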

Jens
