Hello!
I’m using the German stemming analyzer to index a database using
acts_as_ferret. I have some trouble with wildcard queries, which I use
extensively (and need) for autocompleter fields.
The problem is the following:
In this example, the indexed model contains street names. Some of these
names are:
- Alte Bürger
- Alter Fährweg
- Am Alten Vorhafen
- …
So lots of names which Street#find_with_ferret could match. Let’s try:
Street.find_with_ferret "al*"
-> ["Alte Bürger", "Alter Fährweg", "Am Alten Vorhafen", …]
Fine so far. Next:
Street.find_with_ferret "alt*"
-> ["Alte Bürger", "Alter Fährweg", "Am Alten Vorhafen", …]
Now let’s add another letter:
Street.find_with_ferret "alte*"
-> []
Whoops, nothing there. It should match the same list entries as before.
It looks like this happens to every word that was added to the index
through a stemming analyzer. The same term without a wildcard works:
Street.find_with_ferret "alte"
-> ["Alte Bürger", "Alter Fährweg", "Am Alten Vorhafen", …]
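My guess at the cause, as a minimal sketch: I assume that wildcard terms
bypass the stemmer, while indexed text and plain query terms go through it.
I have not verified this in Ferret's source. The code below is a toy model
in plain Ruby, not Ferret code; toy_stem, build_index, term_search and
prefix_search are made-up names, and the suffix list is only a crude
stand-in for the German stemmer.

```ruby
# Toy model, NOT Ferret's real code: a crude suffix-stripping "stemmer"
# stands in for the German stemming analyzer.
SUFFIXES = %w[en er e].freeze

def toy_stem(token)
  word = token.downcase
  # Strip the first matching suffix, but keep at least three characters.
  suffix = SUFFIXES.find { |s| word.end_with?(s) && word.length > s.length + 2 }
  suffix ? word[0...-suffix.length] : word
end

# Index each name under the stemmed form of each of its words.
def build_index(names)
  index = Hash.new { |hash, term| hash[term] = [] }
  names.each do |name|
    name.split.each { |token| index[toy_stem(token)] << name }
  end
  index
end

# Plain term: the query is stemmed just like the indexed text.
def term_search(index, query)
  index.fetch(toy_stem(query), []).uniq
end

# Wildcard term: ASSUMED to skip the stemmer, so the raw prefix is
# compared against the already stemmed terms in the index.
def prefix_search(index, query)
  prefix = query.downcase.delete_suffix("*")
  index.select { |term, _| term.start_with?(prefix) }.values.flatten.uniq
end

INDEX = build_index(["Alte Bürger", "Alter Fährweg", "Am Alten Vorhafen"])

p prefix_search(INDEX, "alt*")   # indexed term "alt" starts with "alt"
p prefix_search(INDEX, "alte*")  # no indexed term starts with "alte"
p term_search(INDEX, "alte")     # "alte" is stemmed to "alt" before lookup
```

In this model "Alte", "Alter" and "Alten" are all stored as "alt", so the
prefix "alte" is longer than anything in the index and matches nothing,
which is the same pattern I see above.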
Something similar happens with other search terms:
The database contains “Rasenweg” (“weg” is stripped away by an analyzer
and is also a stopword):
Street.find_with_ferret "rasen*"
-> [] # <-- unexpected
Street.find_with_ferret "rasen"
-> ["Rasenweg"] # <-- expected
Street.find_with_ferret "ras*"
-> ["Rasenweg"] # <-- expected
How can I fix this or how is this usually handled? I need to do queries
like this:
Street.find_with_ferret "(alte~ bü~)||(altebü)"
and it should return “Alte Bürger” in the results. This works when I
reformulate the query to:
Street.find_with_ferret "alte~||bü~||altebü"
but this returns results that are far too imprecise.
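To make the shape of the first query concrete, here is a rough sketch of how
such a string could be assembled from what the user has typed into the
autocompleter. The helper name autocomplete_query is hypothetical and not
part of acts_as_ferret; it only builds the string and does not talk to the
index.

```ruby
# Hypothetical helper: turns autocompleter input into a query of the form
# "(word1~ word2~)||(word1word2)".
def autocomplete_query(input)
  words = input.downcase.split
  fuzzy = words.map { |word| "#{word}~" }.join(" ")  # each word as a fuzzy term
  "(#{fuzzy})||(#{words.join})"                      # plus all words run together
end

puts autocomplete_query("Alte Bü")
```

The resulting string would then be passed to Street.find_with_ferret as in
the examples above.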