Jens K. wrote:
On Fri, Jul 20, 2007 at 10:49:18AM +0200, Kyle Nord wrote:
[…]
I should have actually re-phrased that, I apologize.
Long-term solution thinking here… What happens when the load gets too
high on the box the index is currently on? Is there a way to scale it
horizontally, besides having certain app servers access certain
indexes (e.g. photos searched from app003 → search003, and profiles
from app002 → search002)?
Ah ok, I see.
First of all: have you already checked whether load will ever be too
high on a dedicated search box with your expected traffic? You know,
premature optimization is the root of all evil.
Actually the one-process-per-index thing is only necessary for writing
to the index; you can have multiple searchers open on the same physical
index. So if you find a way to give multiple servers access to the same
physical index (e.g. via NAS) you could run searches from multiple
machines, but write access would have to be restricted to one machine.
Shared file systems like NFS or Samba aren’t a good idea for
sharing the index.
Of course you could also just duplicate your index across machines for
faster searching, but then you’ll have to take care of syncing the
copies with the master every now and then.
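
A minimal sketch of what such a sync job might look like, assuming rsync
is installed on the app servers. The host name and paths are hypothetical
placeholders, not anything acts_as_ferret defines, and the sketch assumes
the copy runs while the master index is not being written to:

```ruby
# Sketch only: the host name and paths are hypothetical placeholders.
INDEX_SOURCE = "search-master:/var/www/app/index/production/"
INDEX_TARGET = "/var/www/app/index/production/"

# Build the rsync argument list.
# -a preserves permissions and timestamps; --delete removes files on the
# copy that no longer exist on the master.
def rsync_command(source, target)
  ["rsync", "-a", "--delete", source, target]
end

# Run the sync and report failure through the return value.
# system returns true only if rsync exits with status 0.
def sync_index!(source = INDEX_SOURCE, target = INDEX_TARGET)
  system(*rsync_command(source, target)) == true
end

# Example crontab entry (assumption) to call sync_index! from a script
# every day at 4am:
#   0 4 * * * ruby /path/to/sync_index.rb
```

Passing the arguments to system as a list avoids shell quoting problems
if a path ever contains spaces.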
The stock DRb server doesn’t have built-in support for any of these
distribution scenarios.
If searches really are equally distributed between the various indexes
putting them on separate machines might be your best option, imho.
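
As a rough illustration of the one-index-per-machine layout, the lookup
below maps an index name to the search box that owns it. The host names
follow the example earlier in the thread; the port number and the helper
name are assumptions made for this sketch only:

```ruby
# Hypothetical mapping from index name to the search box that owns it.
# Host names come from the thread's example; the port is an assumption.
SEARCH_SERVERS = {
  photos:   "druby://search003:9010",
  profiles: "druby://search002:9010"
}.freeze

# Return the DRb URI for an index, or fail loudly if none is configured.
def search_server_for(index_name)
  SEARCH_SERVERS.fetch(index_name.to_sym) do
    raise ArgumentError, "no search server configured for #{index_name}"
  end
end
```

For example, search_server_for(:photos) returns the search003 URI, while
an unknown index name raises instead of silently hitting the wrong box.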
Jens
–
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[email protected] | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
Thanks so much for the reply, that’s all I needed to know! I actually
used to use backgroundrb to fetch the search results and render
them with JavaScript (a kind of Ajax-style search), but I had a huge
bottleneck with the DRb server. So I’m hoping the new acts_as_ferret
with Ferret 0.11.4 will be the ticket, and so far it looks as if it is
=).
Once I get the stats, I’ll fill you in on our load. Currently we see
about 15,000 unique searches a day (peaking between 5 and 8pm PST), and
the load hits 1.8; that’s on a Sun X4100. So if I could offload some of
the processing to other machines during those peak hours, that would be
ideal.
Maybe just cron an rsync job for the index every now and again.
BTW: I’m very impressed that if the DRb server cannot be reached,
acts_as_ferret falls back to the local index (and crashes if that does
not exist either). That is a great failsafe: if you have an rsync job
that copies the index to your app servers every day at 4am, it seems to
be a great “nothing shared” tactic.
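
The fallback behaviour described above can be sketched roughly as follows.
This is not acts_as_ferret’s actual implementation, just an illustration
of the pattern; the method name, the keyword arguments, and the list of
errors treated as "unreachable" are all assumptions:

```ruby
require "socket" # defines SocketError

# Errors treated as "search server unreachable". With a real DRb client
# you would likely add DRb::DRbConnError (after require "drb").
UNREACHABLE = [Errno::ECONNREFUSED, Errno::ETIMEDOUT, SocketError].freeze

# remote and local are any objects responding to call(query), e.g. a
# lambda wrapping the DRb proxy and one wrapping the local index copy.
def search_with_fallback(query, remote:, local:, unreachable: UNREACHABLE)
  remote.call(query)
rescue *unreachable
  # Remote server is down: answer from the (possibly stale) local copy.
  local.call(query)
end
```

Errors raised by the local search are deliberately not rescued, which
matches the "crashes if the local index does not exist" behaviour.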