Hi all,
I cannot make aaf (rev. 220) use my custom analyzer, despite following
the instructions at
http://projects.jkraemer.net/acts_as_ferret/wiki/AdvancedUsage
To pinpoint the problem, I created a model + a simple analyzer with 2
stop words : "fax" and "gsm".
test 1 : model.rebuild_index + model.find_by_contents("fax") # fax is a stop word.
=> I get a result when I should not.
(note : I deleted the index directory beforehand => I can see the index
is recreated, index/develop).
test 2 : insert a 'raise' in the token_stream() method => it is never
raised.
test 3 : use the standard analyzer to exclude the 2 stop words (model
below) => same wrong result.
class AccessPointKind2 < ActiveRecord::Base
  set_table_name "access_point_kinds2"
  acts_as_ferret(
    { :remote => true, :fields => { :name => { :store => :yes } } },
    { :analyzer => Ferret::Analysis::StandardAnalyzer.new(["fax", "gsm"]) }
  )
end
Here are the model and the analyzer :
MODEL :
class AccessPointKind2 < ActiveRecord::Base
  set_table_name "access_point_kinds2"
  acts_as_ferret(
    { :remote => true, :fields => { :name => { :store => :yes } } },
    { :analyzer => PlainAsciiAnalyzer.new }
  )
end
ANALYZER :
lib/plain_ascii_analyzer.rb
class PlainAsciiAnalyzer < ::Ferret::Analysis::Analyzer
  include ::Ferret::Analysis
  def token_stream(field, str)
    StopFilter.new(
      StandardTokenizer.new(str),
      ["fax", "gsm"]
    )
    # raise   <<<----- is never executed when uncommented !!
  end
end
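To make the expected behaviour concrete, here is a plain-Ruby sketch that does not use Ferret at all. The names STOP_WORDS and sketch_analyze are made up for illustration, and the regex split is only a rough stand-in for StandardTokenizer (no case folding, no special handling of e-mail addresses and the like). It only shows what I expect the stop filter to do with the terms involved:

```ruby
# Plain-Ruby stand-in for the intended analyzer behaviour (no Ferret needed).
# STOP_WORDS and sketch_analyze are hypothetical names, not part of Ferret or aaf.
STOP_WORDS = %w[fax gsm].freeze

# Split the input into runs of alphanumeric characters, then drop stop words.
# This is an approximation of "tokenize, then apply a stop filter".
def sketch_analyze(str, stop_words = STOP_WORDS)
  str.scan(/[[:alnum:]]+/).reject { |token| stop_words.include?(token) }
end

p sketch_analyze("gsm")          # => []
p sketch_analyze("gsm(werk)")    # => ["werk"]
p sketch_analyze("fax machine")  # => ["machine"]
```

If the analyzer were applied to both the indexed text and the query, a search for "gsm" alone would be left with no terms at all, which is why I expect no hits.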
In the console, I rebuild the index + search for a stop word => I get
results when I should not :
reload!; AccessPointKind2.rebuild_index; AccessPointKind2.find_by_contents("gsm").collect &:name
Reloading...
AccessPointKind2 Columns (0.002963) SHOW FIELDS FROM access_point_kinds2
Asked for a remote server ? true, ENV["FERRET_USE_LOCAL_INDEX"] is nil, looks like we are not the server
Will use remote index server which should be available at druby://localhost:9010
default field list: [:name]
AccessPointKind2 Load (0.002706) SELECT * FROM access_point_kinds2 WHERE (access_point_kinds2.id in ('7','12','13','8','2'))
Query: gsm
total hits: 5, results delivered: 5
=> ["gsm", "gsm", "gsm(werk)", "gsm(privé)", "gsm(privé)"]
I guess it’s obvious, but I cannot see it.
Help.
Thanks in advance.
Alain