Custom analyzer: where to put it?

differenthink · September 7, 2007, 4:21pm

Hi,

I'm trying to use a custom analyzer to add my French stop words. I'm
reading the tutorial at:
http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage

My problem is that I have no idea where to put my custom analyzer class,
for example:

class GermanStemmingAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis

  def initialize(stop_words = FULL_GERMAN_STOP_WORDS)
    @stop_words = stop_words
  end

  def token_stream(field, str)
    StemFilter.new(
      StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words),
      'de'
    )
  end
end

Any clue?

Thanks a lot

Guillaume.

differenthink · September 7, 2007, 4:54pm

Hey,

Do you mean where to place it in your directory structure?

I put mine in lib/, but any place is fine (maybe app/models/);
just make sure it's on the Rails load path.

Ben

differenthink · September 7, 2007, 4:57pm

Hi,

#{RAILS_ROOT}/lib is a good place for things like this. If you name your
file correctly, i.e. german_stemming_analyzer.rb, Rails will auto-load
it when you use the class name.

Jens
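
To make the naming rule concrete, here is a minimal sketch, assuming the usual Rails convention that a class name maps to its underscored file name. `expected_file_name` is a hypothetical helper written for illustration, not a Rails method; Rails itself relies on ActiveSupport's `underscore` for this kind of conversion.

```ruby
# Hypothetical helper: derives the file name that Rails-style autoloading
# would look for, given a CamelCase class name.
def expected_file_name(class_name)
  snake = class_name
    .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split runs of capitals
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split lower-to-upper boundaries
    .downcase
  "#{snake}.rb"
end

puts expected_file_name("GermanStemmingAnalyzer") # => german_stemming_analyzer.rb
puts expected_file_name("FrenchStemmingAnalyzer") # => french_stemming_analyzer.rb
```

Under that convention, a file named FrenchStemmingAnalyzer.rb would probably not be picked up automatically and would need an explicit require.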

differenthink · September 10, 2007, 12:09am

Hi, I tried, but without success.

Here is what I did; please tell me if something is wrong:

In my class:

acts_as_ferret({ :fields => {
    :nom => {},
    :description => {:boost => 0},
    :logiciel_nom => {},
    :logiciel_id => {},
    :difficulte_id => {},
    :systeme_nom => {},
    :fai_nom => {},
    :fai_id => {},
    :lversion_id => {},
    :site_nom => {},
    :siteutilise_id => {},
    :nom_for_sort => {:index => :untokenized},
    :note => {:index => :untokenized},
    :visions_count => {:index => :untokenized},
    :nb_vu => {:index => :untokenized},
    :date_sort => {:index => :untokenized}
  }}, :analyzer => FrenchStemmingAnalyzer.new)

My FrenchStemmingAnalyzer.rb (in /lib):

class FrenchStemmingAnalyzer < Ferret::Analysis::Analyzer
  include Ferret::Analysis

  def initialize(stop_words = FULL_FRENCH_STOP_WORDS)
    @stop_words = stop_words
  end

  def token_stream(field, str)
    StemFilter.new(
      StopFilter.new(LowerCaseFilter.new(StandardTokenizer.new(str)), @stop_words),
      'fr'
    )
  end
end

There is no error. I erased the /index/ directory from the previous Ferret
analysis, restarted, and ran a search, but I still get the same results as
before, even when searching with only stop words.
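
For reference, this is roughly the behavior one would expect from the lowercase and stop-word stages of such an analyzer: a query made only of stop words should leave no tokens to search for. The sketch below is plain Ruby and does not use Ferret; `analyze` and `SAMPLE_FRENCH_STOP_WORDS` are hypothetical names, the word list is a tiny made-up sample, and stemming is left out entirely.

```ruby
# Hypothetical sample list; NOT Ferret's FULL_FRENCH_STOP_WORDS constant.
SAMPLE_FRENCH_STOP_WORDS = %w[le la les de des un une et].freeze

# Plain-Ruby imitation of tokenize -> lowercase -> stop-word filtering.
def analyze(text, stop_words = SAMPLE_FRENCH_STOP_WORDS)
  tokens = text.scan(/[[:alpha:]]+/)           # split into words
  tokens = tokens.map { |t| t.downcase }       # lowercase each token
  tokens.reject { |t| stop_words.include?(t) } # drop stop words
end

puts analyze("Le chat et la souris").inspect # => ["chat", "souris"]
puts analyze("le la les").inspect            # => []
```

If a stop-word-only query still returns hits, that suggests the custom analyzer is not the one being applied at index or query time.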

I do a fuzzy search with Ferret (via find_by_content); could that be the
cause of the problem?
Or maybe there is an error in the code?

Thank you for helping,

Guillaume.

differenthink · September 7, 2007, 5:27pm

Thanks, I'll try.

differenthink · October 1, 2007, 4:50am

acts_as_ferret({ :fields => {
    :nom => {},
    :description => {:boost => 0},
    :logiciel_nom => {},
    :logiciel_id => {},
    :difficulte_id => {},
    :systeme_nom => {},
    :fai_nom => {},
    :fai_id => {},
    :lversion_id => {},
    :site_nom => {},
    :siteutilise_id => {},
    :nom_for_sort => {:index => :untokenized},
    :note => {:index => :untokenized},
    :visions_count => {:index => :untokenized},
    :nb_vu => {:index => :untokenized},
    :date_sort => {:index => :untokenized}
  }}, :analyzer => FrenchStemmingAnalyzer.new)

The syntax for the acts_as_ferret options is a bit odd. The code above
should read as follows; notice the extra {} braces around
:analyzer => FrenchStemmingAnalyzer.new:

acts_as_ferret({ :fields => {
    :nom => {},
    :description => {:boost => 0},
    :logiciel_nom => {},
    :logiciel_id => {},
    :difficulte_id => {},
    :systeme_nom => {},
    :fai_nom => {},
    :fai_id => {},
    :lversion_id => {},
    :site_nom => {},
    :siteutilise_id => {},
    :nom_for_sort => {:index => :untokenized},
    :note => {:index => :untokenized},
    :visions_count => {:index => :untokenized},
    :nb_vu => {:index => :untokenized},
    :date_sort => {:index => :untokenized}
  }}, { :analyzer => FrenchStemmingAnalyzer.new })
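
A note on how the two hashes are bound, in case it helps with debugging. The sketch assumes (not checked against the plugin source) that acts_as_ferret takes two hash arguments, one for its own options and one for Ferret options such as :analyzer. `two_hashes` is a hypothetical stand-in that only reports what it received. In plain Ruby, trailing key/value pairs without braces are also collected into the last hash argument, so it may be worth confirming what actually reaches the method if the analyzer still seems to be ignored.

```ruby
# Hypothetical stand-in for a method that takes two option hashes.
# It only reports what it received; it is not acts_as_ferret itself.
def two_hashes(options = {}, ferret_options = {})
  { :options => options, :ferret_options => ferret_options }
end

# Explicit braces around the second hash, as in the call above.
explicit = two_hashes({ :fields => { :nom => {} } }, { :analyzer => :custom })

# Trailing key/value pairs without braces end up in one hash as well.
braceless = two_hashes({ :fields => { :nom => {} } }, :analyzer => :custom)

# Putting :analyzer inside the first hash leaves the second one empty.
misplaced = two_hashes({ :fields => { :nom => {} }, :analyzer => :custom })

puts explicit == braceless
puts misplaced[:ferret_options].empty?
```

The case to watch for is the last one, where :analyzer sits inside the first hash and never reaches the Ferret options.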