Hi all,
I’ve recently started working with Ferret and I’m seeing what seem to
be slow searches. I have about 10,000 documents in the index, with
several fields per document; some fields have an array of several
values that are indexed.
I am using a RAMDirectory to store the index for searching. In my
testing, the searches themselves are reasonable, at around 0.2 to 0.5
seconds per search (for simple single-word searches). However,
retrieving the matching documents from the index ends up taking well
over 2 to 3 seconds, totally eclipsing the search time and making the
whole thing quite slow. Am I missing something here? Will reducing the
document size greatly affect the retrieval time? Any suggestions for
general speed improvement? Thanks!
Below, I have detailed the process I am using to create and search the
index, in case that’s useful:
I have created an index that is stored on disk. I’d like to read it back
into memory and use a RAMDirectory to see what speed improvements I can
get by using that.
Here’s what I’m doing to create the index:
ram_dir = Ferret::Store::RAMDirectory.new
in_mem_index = Ferret::Index::IndexWriter.new(ram_dir, :create => true)
# ... add stuff to the index
in_mem_index.optimize
in_mem_index.close
index = Ferret::Index::Index.new(:dir => ram_dir)
index.persist('path/to/index', true)
index.close
I use a RAMDirectory when initially writing to the index because I am
writing a lot to the index and I assume writing directly to a
FSDirectory will be slower.
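For what it’s worth, here is roughly how I would check that assumption
with Ruby’s standard Benchmark module. This is only a sketch:
build_index(dir) is a hypothetical stand-in for my “add stuff to the
index” loop above, and the FSDirectory path is just an example.

ruby
require 'benchmark'
require 'ferret'

# Hypothetical timing harness: compare indexing into a RAMDirectory
# against indexing straight into an FSDirectory. build_index(dir) is
# assumed to wrap the IndexWriter code shown earlier.
Benchmark.bm(14) do |bm|
  bm.report('RAMDirectory:') do
    build_index(Ferret::Store::RAMDirectory.new)
  end
  bm.report('FSDirectory:') do
    build_index(Ferret::Store::FSDirectory.new('path/to/tmp_index', true))
  end
end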
Later, I am trying to load this index back into memory as a
RAMDirectory. I am not actually sure how to do this, so I am guessing
here:
ram_dir = Ferret::Store::RAMDirectory.new
index = Ferret::Index::Index.new(:dir => ram_dir, :create => true)
index.add_indexes(Ferret::Store::FSDirectory.new('path/to/index'))

results = []
num_results = index.search_each('search word(s)',
    { :first_doc => 0, :num_docs => 50 }) do |doc, score|
  results << index[doc]
end
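One thing I’m wondering about the retrieval cost: since Ferret
documents are lazy-loaded field by field, would it help to pull only a
single stored field per hit instead of the whole document? A sketch of
what I mean, assuming my documents had a stored :id field (which is
hypothetical; my actual schema differs):

ruby
# Variant of the loop above: instead of materialising the full document
# for every hit, read just one stored field per hit. Because index[doc]
# returns a lazily loaded document, accessing only :id should avoid
# reading the other stored fields.
ids = []
index.search_each('search word(s)',
    { :first_doc => 0, :num_docs => 50 }) do |doc, score|
  ids << index[doc][:id]
end

If that is the right idea, I could then load the handful of full
documents I actually display, rather than all 50 hits.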
Any help would be awesome. Thanks!
- chris