My AAF tweaks

I have had to fix a few issues with AAF in order to get it working
well for myself in a production environment. I’m using the latest
“release” version which is 0.4.1:

  1. When there is no index in place, every request starts a new
    rebuild. While there is some code in place to allow this to happen
    during testing, personally I see no reason to even test for this,
    although maybe it’s necessary for those using multiple indexes (?).
    At the very least, the same index should never be built twice at
    the same time, so I put in rudimentary code that makes each request
    wait on a lock until the index is complete.

The real problem is that all the rebuilds use the same FERRET_INDEX/
rebuild path. With two reindexes running and files being replaced
underneath them, the DRb server core dumps and CPU load becomes
massive. This may be the reason for a lot of stability
complaints, as I think a lot of people just remove their index
instead of calling rebuild.
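
The locking idea from item 1 can be sketched roughly as below. This is
not my actual patch and not AAF code: RebuildGuard, rebuild_if_missing
and the lock file name are hypothetical, and the sketch assumes a plain
exclusive file lock is enough to keep two rebuilds of the same index
from running at once.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper, not part of acts_as_ferret: serialize index rebuilds
# with an exclusive lock on a lock file next to the index directory.
class RebuildGuard
  def initialize(index_dir)
    @index_dir = index_dir
    @lock_path = "#{index_dir}.rebuild.lock" # assumed naming, for illustration
  end

  # Yields the index directory only if the index is still missing once the
  # lock is held. Requests that waited on another rebuild return false
  # without starting a second one. Returns true if this caller rebuilt.
  def rebuild_if_missing
    FileUtils.mkdir_p(File.dirname(@lock_path))
    File.open(@lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
      lock.flock(File::LOCK_EX)      # blocks until any running rebuild is done
      return false if index_present? # someone else built it while we waited
      yield @index_dir
      true
    end
  end

  # Treats a non-empty directory as "index in place" (a simplification).
  def index_present?
    Dir.exist?(@index_dir) && !Dir.empty?(@index_dir)
  end
end
```

A request would wrap whatever triggers the rebuild in
rebuild_if_missing { |dir| ... }. The check after taking the lock is
the important part: without it, every waiting request would still
rebuild once the first one finished.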

  2. Performance degrades as I index articles until I call optimize
    on the index. Optimize can take many seconds and seems to lock all
    access via the DRb server. I added logic to use a separate index for
    modifications (adds/deletes) and optimizations. It required
    significant hacking of the AAF plug-in. I basically have a writable
    index that is copied each time to a new read-only index
    location, followed by changing the index_dir in AAF to the new
    read-only index. This prevents the slowdown during indexing, but
    everything still seems to lock during optimizations. I have a faster
    server now, so optimizations only take around 6 seconds. I may have
    to use a separate DRb server to do the optimizations. I am not sure
    where this locking occurs. I would like to see AAF take care of this
    locking issue somehow.
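
The copy-then-switch step for the read-only index looks roughly like
the sketch below. Again, this is an illustration and not the patch
itself: IndexPublisher, publish and the "v1", "v2" directory names are
hypothetical, and current_read_dir only stands in for the index_dir
change inside AAF, which the sketch does not touch.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper, not part of acts_as_ferret: copy the writable index
# to a fresh directory that readers use while writes continue elsewhere.
class IndexPublisher
  attr_reader :current_read_dir

  def initialize(writable_dir, read_root)
    @writable_dir = writable_dir
    @read_root = read_root
    @version = 0
    @current_read_dir = nil
  end

  # Copies the writable index into a new versioned directory, switches the
  # reader path to it, then deletes the previous snapshot.
  def publish
    @version += 1
    target = File.join(@read_root, "v#{@version}")
    staging = "#{target}.tmp"
    FileUtils.mkdir_p(@read_root)
    FileUtils.rm_rf(staging)               # clear leftovers from a failed copy
    FileUtils.cp_r(@writable_dir, staging) # readers keep using the old snapshot
    File.rename(staging, target)           # expose the copy only once complete
    previous = @current_read_dir
    @current_read_dir = target             # stand-in for updating index_dir
    # A real setup should wait until no reader still uses the old snapshot.
    FileUtils.rm_rf(previous) if previous
    target
  end
end
```

Copying into a staging directory and renaming it afterwards means a
reader never sees a half-copied index. It does nothing about the
locking during optimize, which is a separate problem.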

None of the above are ready to be checked in publicly but I’d be
happy to send a patch if someone wants to base some work on it.

Other than these issues, AAF/Ferret have been excellent, and
basically “just worked”. I am able to handle around 40 requests per
second, including rendering a results page. I haven’t finished
performance testing as that is more than enough performance for me
right now.

-Alex

On Nov 18, 2007, at 7:33 PM, Alex Neth wrote:

> I have had to fix a few issues with AAF in order to get it working
> well for myself in a production environment. I’m using the latest
> “release” version which is 0.4.1:

Thanks for sharing Alex.

It seems like all of the issues you described are pretty significant
and would affect most users. Did you ever get around to submitting a
patch? If it’s working for you in production then the patch is good
enough to at least start some discussion.

John