Custom sort routine

Is it possible to write a custom sort routine for Ferret?

I use Ferret right now to index all my products. One of the variables
in these product documents is the product popularity, where 1 = best
selling product, 2 = 2nd best, etc.

Right now, I’m just sorting by the popularity column in my search
results, but this doesn’t always give “good” results, and neither
does sorting by document relevance alone. I’d like to sort by some
combination of the two. Is it possible to do this efficiently with Ferret?

any help would be appreciated,
thanks,
ray

On Thu, Aug 23, 2007 at 03:57:39AM +0200, Raymond O’Connor wrote:

is it possible to write a custom sort routine for ferret?

I use ferret right now to index all my products. One of the variables
in these product documents is the product popularity, where 1 = best
selling product, 2 = 2nd best, etc…

Right now, I’m just sorting by the popularity column in my search
results, although this doesn’t always provide “good” results, neither
does just sorting by document relevance. I’d like some combination of
the two to sort by. Is it possible to do this efficiently with ferret?

Adding boosted ORed clauses that query for specific popularities might
work:

(query AND popularity:1)^3 OR (query AND popularity:2)^2 OR query
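
A query string of this shape could be assembled with a small helper. This is only a sketch: `boosted_popularity_query` and the boost table are hypothetical names, not part of Ferret, and it assumes `query` is a single term (or is already parenthesised).

```ruby
# Hypothetical helper: build a boosted OR query in the shape of the example.
# `boosts` maps a popularity value to the boost factor for that clause.
def boosted_popularity_query(query, boosts)
  clauses = boosts.map do |popularity, boost|
    "(#{query} AND popularity:#{popularity})^#{boost}"
  end
  # The bare query goes last so that less popular products still match.
  (clauses + [query]).join(' OR ')
end

puts boosted_popularity_query('ipod', { 1 => 3, 2 => 2 })
# prints: (ipod AND popularity:1)^3 OR (ipod AND popularity:2)^2 OR ipod
```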

Jens


Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de

Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa

Thanks for the help.
I’m not sure I fully understand your query, but I don’t think that would
work because my popularity variable ranges from 1 to ~1.5 million
(where a product with popularity 1 is the best selling product and a
product with popularity 1.5 million is the worst selling). It’s
similar to an Amazon sales rank, if you’re familiar with that.

Correct me if I’m misunderstanding what you’re suggesting, though.

Thanks,
Ray


On Thu, Aug 23, 2007 at 10:50:30AM +0200, Raymond O’Connor wrote:

Thanks for the help.
I’m not sure I fully understand your query, but I don’t think that would
work because my popularity variable basically ranges from 1 to ~1.5
million (where a product with popularity 1 is the best selling product
and a product with popularity 1.5 million is the worst selling product). It’s
similar to an Amazon sales rank if you’re familiar with that.

In this case you could use RangeQueries on the popularity field instead,
or add a new field whose value is derived from the popularity, e.g. on a
scale from 1 to 10.

The idea of my example was to let more popular products match the more
highly boosted sub-queries, which should then give them a higher Ferret
score.
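
One way to derive such a 1 to 10 field from a sales rank is a logarithmic scale, so that the difference between rank 1 and rank 100 counts for more than the difference between rank 1,000,000 and rank 1,000,100. This is a sketch under assumptions: `popularity_bucket` and `MAX_RANK` are made-up names, the ~1.5 million upper bound is taken from the thread, and the log scale is just one possible choice.

```ruby
# Assumed worst sales rank, taken from the ~1.5 million mentioned above.
MAX_RANK = 1_500_000

# Hypothetical helper: map a sales rank (1 = best seller) onto a coarse
# score from 1 to 10, where 10 is the most popular bucket.
def popularity_bucket(rank, max_rank = MAX_RANK)
  rank = rank.clamp(1, max_rank)                  # guard against out-of-range ranks
  fraction = Math.log(rank) / Math.log(max_rank)  # 0.0 for rank 1, 1.0 for the worst rank
  10 - (fraction * 9).round
end

puts popularity_bucket(1)          # 10
puts popularity_bucket(1_500_000)  # 1
```

The bucket would then be stored in its own field at index time, and the boosted clauses would match on that field rather than on the raw rank.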

Jens



On Thu, Aug 23, 2007 at 11:51:42PM +0200, Sam G. wrote:

Raymond O’Connor wrote:

Right now, I’m just sorting by the popularity column in my search
results, although this doesn’t always provide “good” results, neither
does just sorting by document relevance. I’d like some combination of
the two to sort by. Is it possible to do this efficiently with ferret?

Another option, although requiring a bit more work for the index, would
be to boost each product by a dynamic value (appropriate to a normalised
popularity perhaps) at index build time. Then you could just search and
popularity would automatically be utilised.

Cool, I didn’t think of this. It sounds better to me than constructing the
complex queries I suggested :)
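
The index-time boost Sam describes needs some mapping from sales rank to a boost factor. A minimal sketch, assuming a log scale and a maximum boost of 3.0 (both arbitrary choices); `popularity_boost`, `MAX_RANK` and `MAX_BOOST` are hypothetical names. The commented lines show where the value might be used, assuming `Ferret::Document.new` takes the boost as its first argument; that part is untested here.

```ruby
MAX_RANK  = 1_500_000  # assumed worst sales rank, from the thread
MAX_BOOST = 3.0        # assumed boost for the best seller

# Hypothetical helper: turn a sales rank (1 = best seller) into a boost
# between 1.0 (worst seller) and max_boost (best seller).
def popularity_boost(rank, max_rank = MAX_RANK, max_boost = MAX_BOOST)
  rank = rank.clamp(1, max_rank)
  # Log scale: 1.0 for the best seller, down to 0.0 for the worst.
  normalised = 1.0 - Math.log(rank) / Math.log(max_rank)
  1.0 + normalised * (max_boost - 1.0)
end

# At index build time the value would be attached to each document, e.g.:
#   doc = Ferret::Document.new(popularity_boost(product.popularity))
#   doc[:name] = product.name
#   index << doc

puts popularity_boost(1)          # 3.0
puts popularity_boost(1_500_000)  # 1.0
```

Because the boost is baked into the index, it only changes when the document is re-indexed, so ranks that move often would need periodic rebuilds.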

Jens


Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database


Cool. Just today I was looking for a solution like this :)

When implementing it, I stumbled upon a problem while indexing with
script/runner Mymodel.rebuild_index:

class Mymodel < ActiveRecord::Base
  acts_as_ferret :fields => {:name => {:boost => :rating}}

  # function for determining boost value
  def rating
    instance_rating
  end
end

This exits with:

./script/…/config/…/vendor/rails/railties/lib/commands/runner.rb:47:
./script/…/config/…/vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:136:in `add_field': can't convert Symbol into Float (TypeError)

Using the function name instead of the :symbol doesn’t work either:

./script/…/config/…/vendor/rails/railties/lib/commands/runner.rb:47: undefined method `rating' for Place:Class (NoMethodError)


Hi!

please see comments below.

On Sun, Aug 26, 2007 at 11:48:16PM +0200, Kasper W. wrote:

When implementing I stumbled upon a problem indexing with

This exits with

./script/…/config/…/vendor/rails/railties/lib/commands/runner.rb:47:
./script/…/config/…/vendor/plugins/acts_as_ferret/lib/acts_as_ferret.rb:136:in `add_field': can't convert Symbol into Float (TypeError)

D’uh ;)
Aaf (acts_as_ferret) doesn’t support dynamic per-document or per-field
boosts yet, at least not in the declarative way outlined above. For now,
you’ll have to override the to_doc instance method so you can apply the
boost manually.
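
A rough sketch of what such an override could look like. To keep it runnable without Rails or Ferret, it uses stand-ins: `StubDocument` plays the role of `Ferret::Document` (assumed to be a Hash with a `boost` attribute) and `StubFerretMethods` plays the role of the `to_doc` that acts_as_ferret adds to the model. Both are hypothetical, and whether `super` reaches aaf's own `to_doc` in a real model depends on how the plugin mixes its methods in.

```ruby
# Stand-in for Ferret::Document: a Hash that also carries a boost (assumption).
class StubDocument < Hash
  attr_accessor :boost

  def initialize(boost = 1.0)
    super()
    @boost = boost
  end
end

# Stand-in for the to_doc instance method provided by acts_as_ferret (assumption).
module StubFerretMethods
  def to_doc
    doc = StubDocument.new
    doc[:name] = name
    doc
  end
end

# Plain Ruby object in place of the ActiveRecord model from the thread.
class Mymodel
  include StubFerretMethods
  attr_reader :name, :instance_rating

  def initialize(name, instance_rating)
    @name = name
    @instance_rating = instance_rating
  end

  # Per-record boost value, as in the snippet quoted above.
  def rating
    instance_rating
  end

  # Build the document as usual, then apply the per-record boost by hand.
  def to_doc
    doc = super
    doc.boost = rating.to_f
    doc
  end
end

doc = Mymodel.new('Widget', 2.5).to_doc
puts doc.boost  # 2.5
```

In the real model, the field declaration would drop the `:boost => :rating` option, since that is what triggers the TypeError, and the boost would be applied only inside `to_doc`.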

I’ll add that to aaf soon, just created a ticket:
http://projects.jkraemer.net/acts_as_ferret/ticket/166

cheers,
Jens

