I have had to fix a few issues with AAF in order to get it working
well for myself in a production environment. I’m using the latest
“release” version which is 0.4.1:
- When there is no index in place, every request starts a new
rebuild. While there is some code in place to allow this to happen
during testing, personally I see no reason to even test for this,
although maybe it’s necessary for those using multiple indexes (?).
At the very least, the same index should never be built twice at the
same time, so I put in rudimentary code that simply blocks a request
until the index is complete.
The real problem is that all the rebuilds use the same
FERRET_INDEX/rebuild path. This causes the DRb server to core dump
and produces massive CPU load, because two reindexes are running and
files are replaced underneath them. This may be the reason for a lot
of stability complaints, as I think a lot of people just remove
their index instead of calling rebuild.
- Performance degrades as I index articles and only recovers once I
call optimize on the index. Optimize can take many seconds and seems
to lock all access via the DRb server. I added logic to use a
separate index for
modifications (adds/deletes) and optimizations. It required
significant hacking of the AAF plug-in. I basically have a writable
index that is copied each time to a new read-only index location,
after which I change the index_dir in AAF to point at the new
read-only index. This prevents the slowdown during indexing, but
everything still seems to lock during optimizations. I have a faster
server now, so optimizations only take around 6 seconds. I may have
to use a separate DRb server to do the optimizations. I am not sure
where this locking occurs. I would like to see AAF take care of this
locking issue somehow.
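
To make the copy-then-switch idea in the second point concrete, here
is a minimal sketch in plain Ruby. It is not the patch mentioned in
this post and it does not touch AAF or Ferret at all: the method name
publish_snapshot, the "snapshot-NNNNNNNN" directory naming and the
keep parameter are all made up for illustration. It only shows the
file-level pattern of copying the writable index to a fresh
directory and handing back the new path, which the caller would then
use as the read-only index location.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper, NOT part of acts_as_ferret or Ferret.
# Copies the writable index directory into a new, numbered snapshot
# directory and returns the snapshot path. Readers are expected to be
# pointed at the returned path by the caller.
def publish_snapshot(writable_dir, snapshots_root, keep: 2)
  FileUtils.mkdir_p(snapshots_root)

  # Only finished snapshots match this pattern (no ".tmp" leftovers).
  existing = Dir.children(snapshots_root)
                .grep(/\Asnapshot-\d{8}\z/)
                .sort
                .map { |name| File.join(snapshots_root, name) }

  next_id = existing.empty? ? 1 : existing.last[/\d+\z/].to_i + 1
  target  = File.join(snapshots_root, format("snapshot-%08d", next_id))
  tmp     = "#{target}.tmp"

  # Copy into a temporary name first, then rename, so a reader never
  # sees a half-copied snapshot under its final name.
  FileUtils.rm_rf(tmp)
  FileUtils.cp_r(writable_dir, tmp)
  File.rename(tmp, target)

  # Prune the oldest snapshots, keeping the newest `keep` directories.
  all = existing + [target]
  all.first([all.size - keep, 0].max).each { |old| FileUtils.rm_rf(old) }

  target
end
```

Keeping at least the previous snapshot around (keep: 2 here) is meant
to give searches that are still running against the old directory a
chance to finish before it is deleted. Whether that is enough depends
on how long readers hold the old index open, which this sketch does
not try to track.
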
None of the above is ready to be checked in publicly, but I’d be
happy to send a patch if someone wants to base some work on it.
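
For the first point (never building the same index twice at once), a
rough sketch of the locking pattern in plain Ruby follows. Again,
this is not the actual patch and makes no AAF or Ferret calls: the
method name rebuild_once, the ".lock" file and the per-process
".rebuild.PID" directory are assumptions chosen for illustration.
The idea is that the first caller takes an exclusive file lock and
builds into its own private directory, while later callers block on
the lock and then find the index already in place.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper, NOT part of acts_as_ferret or Ferret.
# Yields a private build directory only when index_dir is missing.
# Returns true if this call built the index, false if it already
# existed (possibly built by another caller while we waited).
def rebuild_once(index_dir)
  lock_path = "#{index_dir}.lock"
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
    # Blocks until any rebuild that currently holds the lock is done.
    lock.flock(File::LOCK_EX)

    # Someone else finished the index while we were waiting.
    return false if File.directory?(index_dir)

    # Build in a directory unique to this process, so two rebuilds
    # can never replace files underneath each other.
    build_dir = "#{index_dir}.rebuild.#{Process.pid}"
    FileUtils.rm_rf(build_dir)
    FileUtils.mkdir_p(build_dir)

    yield build_dir                    # caller writes index files here

    File.rename(build_dir, index_dir)  # publish the finished index
    true
  end                                  # lock is released on close
end
```

A caller would wrap whatever actually builds the index in the block,
for example rebuild_once(path) { |dir| build_index_into(dir) }, where
build_index_into is a placeholder for the real indexing code.
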
Other than these issues, AAF and Ferret have been excellent, and
basically “just worked”. I am able to handle around 40 requests per
second, including rendering a results page. I haven’t finished
performance testing as that is more than enough performance for me
right now.
-Alex