25,000 model instances is a lot, but 5 minutes seems too long. Any
suggestions for how to speed this up?
Thanks,
Nick
Since you’re calling k.save each time through the loop, you’re making
25,000 database queries. Don’t do that – instead, build one big insert
query with all the data (or at least do it in batches of 1000 or so).
ar-extensions might help here, or there may be another batch insert
plugin; failing that, building the SQL directly wouldn’t be too
difficult.
In general, a query within a loop is a sure sign that something is
wrong, or at least inefficient. Databases work best when you give them
all the data at once and let them sort out how to handle it.
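A minimal sketch of the batch approach, assuming a `keywords` table with a single `name` column (both names are hypothetical). ar-extensions gives you a cleaner `Keyword.import`, but building the SQL by hand is also straightforward:

```ruby
# Sketch: build a handful of multi-row INSERT statements instead of
# issuing 25,000 single-row saves. Table/column names are hypothetical.
def batch_insert_statements(names, batch_size = 1000)
  names.each_slice(batch_size).map do |batch|
    # Escape single quotes the SQL way; join the rows into one VALUES list.
    rows = batch.map { |n| "('#{n.gsub("'", "''")}')" }.join(", ")
    "INSERT INTO keywords (name) VALUES #{rows}"
  end
end

# Each statement inserts up to batch_size rows in a single round trip:
statements = batch_insert_statements(%w[foo bar baz], 2)
# statements.each { |sql| ActiveRecord::Base.connection.execute(sql) }
```

With a batch size of 1000, the 25,000 saves collapse into 25 queries.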
On Thu, Jun 11, 2009 at 11:24 PM, Ashwin Mangale wrote:
Try commenting out the uniqueness constraint and check the result. To be
fair, the validations listed are fairly standard and could be done before the
keyword object is created (client-side, using something like Ajax + jQuery).
Hope this helps.
-Ashwin
Hi Ashwin. I’m finding that the uniqueness constraint doesn’t add much
additional processing time.
You’re right that the validations are quite standard. However, you
can’t trust input data; it must be validated server-side.
Hi Marnen. I’m surprised I haven’t come across ar-extensions yet. It
looks like the exact solution I need to this problem. Thanks for
recommending it!
-Nick
Hi Ashwin. I’m finding that the uniqueness constraint doesn’t add
much additional processing time.
It’s a bogus constraint, however, because…
You’re right that the validations are quite standard. However, you
can’t rely on input data; it must be validated server-side.
it checks uniqueness in the wrong place. validates_uniqueness_of does
not preclude an interleaving of operations from two processes that
looks like this:
Process A                    Process B
create(:keyword => 'foo')
validate uniqueness
                             create(:keyword => 'foo')
                             validate uniqueness
save
                             save
To be safe, you must have a constraint in the database. If you have
that, the additional app-level validation just wastes performance.
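One way to add that constraint is a migration along these lines (table, column, and class names are hypothetical); with the unique index in place, the database rejects the duplicate in the race above no matter how the two processes interleave:

```ruby
# Sketch of a migration adding a database-level uniqueness constraint.
# Names are hypothetical; adapt to your actual table and column.
class AddUniqueIndexToKeywords < ActiveRecord::Migration
  def self.up
    # The database itself now rejects duplicates, even under
    # concurrent inserts from multiple processes.
    add_index :keywords, :name, :unique => true
  end

  def self.down
    remove_index :keywords, :name
  end
end
```

A violation surfaces as a database error on save rather than a validation failure, so any code that can race needs to rescue it.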
–
Michael S.
http://www.schuerig.de/michael/
Your example is two saves in the same thread/process. The counter-example
REQUIRES two concurrent processes/threads. And yes, his counter-example is
correct, unless you can guarantee that your application will NEVER run
multi-process/multi-threaded (e.g., a single Webrick instance). If your
application will ever be run by multiple people at once, it will almost
certainly need the database constraint. Such is the price of success.
Also, pushing the uniqueness validation into the database will cut your run
time. The uniqueness validation is implemented by doing a lookup on the
unique field before doing the insert, so letting the DB do it cuts the
number of database calls in half. Doing the inserts in batches will also
increase the speed significantly. Yeah, the code isn’t as pretty, but if it
is already clear you need the speed, do it. It is doubtful that it will be
the ugliest bit of code in your app.
I had a similar situation where I had to import groups of around 2,500
objects at a time, based upon the files found within a directory, back
before ar-extensions was around. By outputting a text file and then
telling MySQL to use that as an infile, the time dropped from an average
total of 45 seconds to about 2 seconds for the whole action. I expected
some improvement, but I had no idea how much faster it would really be
until I did it.
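A sketch of that approach, assuming a MySQL `keywords` table with a `name` column (names hypothetical): write one value per line to a temp file, then hand the file to `LOAD DATA INFILE` so the server bulk-loads it in one shot:

```ruby
require 'tempfile'

# Sketch: dump the values to a plain text file, one per line,
# in the format MySQL's LOAD DATA reads by default.
def write_keywords_infile(names)
  file = Tempfile.new('keywords')
  names.each { |n| file.puts(n) }
  file.close
  file.path
end

path = write_keywords_infile(%w[foo bar baz])
sql  = "LOAD DATA LOCAL INFILE '#{path}' INTO TABLE keywords (name)"
# ActiveRecord::Base.connection.execute(sql)
```

Note that `LOCAL INFILE` support must be enabled on both the MySQL server and the client connection for this to work.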
25,000 model instances is a lot, but 5 minutes seems too long. Any
suggestions for how to speed this up?
You might also look at your database to see if it can handle a
transaction that large. Depending on how it handles it, it may need to
store quite a lot in memory…