Hi,
I’m adding some news articles to a keyed Ferret 0.10.14 index and
encountering quite serious instability when concurrently reading from
and writing to the index, even with just 1 writer and 1 reader
process.
If I recreate the index without a key, concurrent reading and writing
seem to work fine (and indexing is about 10 times quicker).
I’m testing by running my indexing script (which retrieves up to 1000
database records using ActiveRecord, adds to the index and exits) and
concurrently manually re-running a search on the index using my Rails
web interface. This is in a dev environment with only 1 user (me) and
about 58000 docs.
The error I get is along the lines of the following, with a different
filename each time:
IO Error occured at <except.c>:79 in xraise
Error occured in fs_store.c:324 - fs_open_input
couldn’ferret_index/development/news_article_versions/_2ih.tix:
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.14/lib/ferret/index.rb:682:in `initialize'
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.14/lib/ferret/index.rb:682:in `ensure_reader_open'
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.14/lib/ferret/index.rb:385:in `[]'
/usr/lib/ruby/1.8/monitor.rb:229:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/ferret-0.10.14/lib/ferret/index.rb:384:in `[]'
#{RAILS_ROOT}/app/models/news_article_version.rb:35:in `ferret_search'
#{RAILS_ROOT}/app/models/news_article_version.rb:35:in `ferret_search'
#{RAILS_ROOT}/app/controllers/news_articles_controller.rb:56:in `search'
It seems to occur roughly once per batch, and usually towards the end of
the batch. I’m not using aaf (acts_as_ferret). I create my keyed index
like this:
@@ferret_index = Index::Index.new(
  :path => "#{RAILS_ROOT}/ferret_index/#{RAILS_ENV}/news_article_versions",
  :field_infos => field_infos,
  :id_field => :id,
  :key => :id,
  :default_input_field => :text)
Unkeyed, I just drop the :key option (duh). :id is just the
ActiveRecord id, from an auto_increment field in MySQL.
As a note, when concurrently searching on the keyed index, the number of
hits returned increases throughout the indexing process. With a
non-keyed index, the number of hits doesn’t increase until the end.
It looks to me as though, when using a keyed index, Ferret commits after
each record is added. When non-keyed, it commits only when the Index is
closed. The fact that I don’t get the error with a non-keyed index might
just be because there are fewer commits, and so fewer opportunities for
the “bug” to trigger.
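To make that guess concrete, here is a toy model in plain Ruby of what I think I'm observing. It is not Ferret code: ToyIndex, visible_count and simulate are names I made up for illustration, and the commit-per-add behaviour for keyed indexes is my assumption, not something I've confirmed in the Ferret source.

```ruby
# Toy model of my guess about commit behaviour. This is NOT Ferret's
# implementation; ToyIndex and its methods are hypothetical.
class ToyIndex
  attr_reader :commits

  def initialize(keyed)
    @keyed = keyed
    @pending = []    # documents added but not yet visible to readers
    @committed = []  # documents a concurrent reader could see
    @commits = 0
  end

  # Add a document. Assumption: a keyed index commits after every add.
  def <<(doc)
    @pending << doc
    commit if @keyed
    self
  end

  # Closing the index flushes anything still pending.
  def close
    commit unless @pending.empty?
  end

  # Number of documents a concurrent reader would see right now.
  def visible_count
    @committed.size
  end

  private

  def commit
    @committed.concat(@pending)
    @pending.clear
    @commits += 1
  end
end

# Add n docs, recording what a reader would see after each add.
# Returns [counts seen during indexing, count after close, number of commits].
def simulate(keyed, n)
  index = ToyIndex.new(keyed)
  seen = (1..n).map do |i|
    index << { :id => i }
    index.visible_count
  end
  index.close
  [seen, index.visible_count, index.commits]
end

keyed_result   = simulate(true, 5)
unkeyed_result = simulate(false, 5)
p keyed_result    # [[1, 2, 3, 4, 5], 5, 5]
p unkeyed_result  # [[0, 0, 0, 0, 0], 5, 1]
```

In this model the keyed case does one commit per document and the reader's hit count climbs during the batch, while the unkeyed case does a single commit at close. That matches the hit counts I see, and it would mean a 1000-record batch gives a concurrent reader roughly 1000 chances to open the index mid-commit instead of one.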
Is this a bug I’ve come across? Is concurrent reading/writing like
this expected to work?
I’m using Ferret 0.10.14 on Ubuntu Edgy, with “ruby 1.8.4 (2005-12-24)
[i486-linux]” and “gcc version 4.1.2 20060928”.
Thanks in advance!