How to update index from a script

Hello all,
I’m using AAF (acts_as_ferret) right now to index my ~3 million DB
records. However, additions to these records are made by an external
script, so the AAF ActiveRecord hooks never see them. Since new records
are only added rarely, I figured I could add them to the Ferret index
manually from a script. I’ve been looking at the Ferret documentation,
but I’m lost on how to update the AAF Ferret index from a Ruby script.
Does anyone have an example of how to do this?

Thanks!
-Chris

On Tue, Nov 28, 2006 at 07:16:15PM +0100, Chris W. wrote:

[…]

You’d have to mimic the way aaf uses the index in your script. The
to_doc method in instance_methods.rb should be a good starting point.

It would be way easier to use ActiveRecord in the script and run
it through script/runner - that way aaf will catch the updates.

Jens


webit! Gesellschaft für neue Medien mbH www.webit.de
Dipl.-Wirtschaftsingenieur Jens Krämer [email protected]
Schnorrstraße 76 Tel +49 351 46766 0
D-01069 Dresden Fax +49 351 46766 66
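
As a rough illustration of the first option in the reply above (mimicking to_doc by hand), here is a minimal sketch in plain Ruby. It assumes the index stores the record id and class name alongside the indexed fields; that is an assumption about aaf's behaviour and should be checked against instance_methods.rb in your aaf version. The helper name `build_ferret_doc`, the `Article` class name and the row contents are hypothetical.

```ruby
# Build a Ferret-style document (a plain Hash) from one database row.
# ASSUMPTION: the index keeps :id and :class_name next to the indexed
# fields; verify this against to_doc in your copy of acts_as_ferret.
def build_ferret_doc(class_name, row, fields)
  doc = { :id => row.fetch(:id).to_s, :class_name => class_name }
  # Every indexed field is stored as a string; missing values become ''.
  fields.each { |field| doc[field] = row[field].to_s }
  doc
end

# Hypothetical row, as an external script might read it back after an INSERT.
row = { :id => 42, :title => 'Ferret tips', :body => 'Indexing from a script' }
doc = build_ferret_doc('Article', row, [:title, :body])
```

The resulting hash is what would then be handed to the index. Going through ActiveRecord with script/runner, as suggested above, avoids having to keep such a helper in sync with aaf.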

Little bump :)

Hi !

I have the same problem:

I use a Ruby on Rails (Hobo) application whose database is filled by
another Ruby script.

I also use the acts_as_ferret plugin to offer full-text search in my
Rails application.

This is the indexed model:

    class Job < ActiveRecord::Base
      acts_as_ferret :additional_fields => [:username, :hostname]

      hobo_model

      belongs_to :atuser
      belongs_to :athost

      #--- Used to index atUsers and atHosts ---#

      def username
        return atuser.name
      end

      def hostname
        return athost.name
      end

      # etc…
    end #~ Job

The problem is that the Ferret index is not updated when I add or delete
rows in my database, because an external application (written in Ruby)
fills the database and it has no access to the Rails model’s
save/destroy methods (which the acts_as_ferret plugin hooks into).

I thought of two solutions:

  • A web service (WSDL) that gives me access to the save/destroy
    methods of my Rails app
  • Updating the Ferret index “on the fly” when inserting new values into
    the database.

I prefer the second solution because the first risks being too slow, and
I need speed.

Finally, the question is: how can I update a Ferret index from outside
my Rails application?

Matt V. wrote:

Sorry, I forgot the most important thing:

Tanks! :)

;)

s/Tanks/Thanks…

Hi!

On Wed, Sep 19, 2007 at 03:39:14PM +0200, Matt V. wrote:
[…]

  • WebService (WSDL) that give me acces to save/destroy function of my
    […] rails application ?

The safest way would be to talk to Ferret’s DRb server. If you don’t do
any updates to the index at all through your web app, you might also
just use plain Ferret to update the index.

Code for adding a record to the index via DRb might look like this:

    require 'drb'
    server = DRbObject.new(nil, 'druby://localhost:9010')
    # `<<` is called with two arguments here, so it has to go through send
    server.send(:<<, record.class.name, record.to_ferret_doc)

Aaf adds its own to_doc method to your AR model, but since you don’t
use AR in your external script, you’ll have to implement your own; see
instance_methods.rb for how aaf does this.

Future versions of aaf might better support this scenario - there are
plans to decouple it from ActiveRecord so it can be used with non-AR
classes. But don’t expect anything in this direction too soon …

cheers,
Jens


Jens Krämer
http://www.jkraemer.net/ - Blog
http://www.omdb.org/ - The new free film database
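
To make the DRb mechanics in the reply above more concrete, here is a small self-contained sketch that uses only Ruby's standard `drb` library. It does not talk to the real aaf DRb server: `FakeIndexServer` and its `add` method are hypothetical stand-ins that only record what they receive, and the document hash is made up. It shows how an external script gets a remote handle and passes a class name plus a document to it.

```ruby
require 'drb'

# Hypothetical stand-in for an index server. The real aaf DRb server has
# its own interface; this one only records what it is given.
class FakeIndexServer
  attr_reader :added

  def initialize
    @added = []
  end

  # Store the class name and document, return the number of stored docs.
  def add(class_name, doc)
    @added << [class_name, doc]
    @added.size
  end
end

backend = FakeIndexServer.new
# Port 0 lets the operating system pick a free port; DRb.uri reports it.
DRb.start_service('druby://127.0.0.1:0', backend)

# What the external script would do: obtain a remote handle and call it.
server = DRbObject.new_with_uri(DRb.uri)
count = server.add('Job', { :id => '1', :username => 'matt', :hostname => 'node1' })

DRb.stop_service
```

In a real setup the server would already be running in its own process, and the script would only contain the `DRbObject` lines, pointed at the server's actual URI.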

Hi, thanks for your answer !

I’m sorry, but I don’t understand why I should connect to this server.

The only thing I want to do is update my index while filling the
database with my script.

This is the aaf field list: [:username, :athost_id, :atuser_id, :date,
:hostname, :url]

I have the username, the date, the hostname and the url.
I can get the athost_id and atuser_id using a SELECT.

Why can’t I just connect to the aaf-generated index and update it using
Ferret itself?

    require 'ferret'
    include Ferret

    # Index already generated by acts_as_ferret (with testing info)
    index = Index::Index.new(:path => 'index')

    index.search_each("*") do |id, score|
      puts "Document #{id} found with a score of #{score}"
    end

    #==> prints nothing!

I use the default aaf index dir:

(From act_methods.rb)

    :index_dir => "#{ActsAsFerret::index_dir}/#{self.name.underscore}",

I tried to connect to

    index = Index::Index.new(:path => 'index/development/job')

But that doesn’t work either :(

Thanks for your help.

On Fri, Sep 21, 2007 at 04:26:44PM +0200, Matt V. wrote:

[…]

That’s probably the best way to go.

cheers,
Jens



I have another idea:

My script, which updates the database, has to be launched periodically
(using crontab).

What do you think of using a Rake task which does the same job?

I will have direct access to my Job ActiveRecord model through my Rails
environment. It will be easier to add a new Job (no SQL, thanks to AR),
and my Ferret index will be updated by aaf!

Am I wrong?

Thanks!
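
A Rake task along these lines might look like the sketch below. The task name `jobs:import`, the helper `import_jobs` and the `NEW_JOBS` rows are hypothetical; the sketch assumes a Rails app that provides the usual `:environment` task and the `Job` model shown earlier in the thread. Because the records are created through the model rather than with raw SQL, the acts_as_ferret save hooks get a chance to run.

```ruby
require 'rake'

# Hypothetical input; a real task would read these rows from wherever the
# external script currently gets its data.
NEW_JOBS = [
  { :url => 'http://example.com/a', :date => '2007-09-21' },
  { :url => 'http://example.com/b', :date => '2007-09-22' }
]

# Create each row through the model class so that its save callbacks
# (including the ones acts_as_ferret installs) are triggered.
# Returns the number of rows processed.
def import_jobs(job_class, rows)
  rows.each { |attrs| job_class.create!(attrs) }
  rows.size
end

namespace :jobs do
  desc 'Import new jobs through ActiveRecord so the Ferret index stays current'
  # :environment is the task Rails provides to load the application.
  task :import => :environment do
    import_jobs(Job, NEW_JOBS)
  end
end
```

From cron, the task would then be run with something like `rake jobs:import RAILS_ENV=production` from the application directory.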