SQLite3::BusyException on database seed?

This is a pretty common error, but for me, the problem isn’t going away. When I run rake db:{drop,create,migrate,seed} or rake db:seed ActiveRecord raises:

#<ActiveRecord::StatementInvalid: SQLite3::BusyException: database is locked>

The fault is my seeds.rb:

require 'etc'

NAME = (?a..?y).to_a.permutation
ITEMS = 200
PROCESSORS = Etc.nprocessors * 2

FORK_PIDS = []
time = Time.now

ITEMS.times do |i|
	elap = Time.now - time
	est_rem = ITEMS.*(elap)./(i).-(elap).round(2)
	print "\e[2K#{i}/#{ITEMS} | Elapsed #{elap.round(2)}s | Rem #{est_rem}s \r"

	if FORK_PIDS.length > PROCESSORS
		Process.wait(FORK_PIDS.shift)
		redo
	else
		name = NAME.next.join.capitalize
		FORK_PIDS << Process.fork { User.create!(username: name, password: 'randomuser') rescue (p $!) }
	end
end

FORK_PIDS.each(&Process.method(:wait))

I used to run this code fine, but now it’s not working even once.

My database.yml looks like this:

default: &default
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000

development:
  <<: *default
  database: db/development.sqlite3
  timeout: 30000

test:
  <<: *default
  database: db/test.sqlite3

production:
  <<: *default
  database: db/production.sqlite3

So in development mode, the timeout is set to 30s. But the insertion doesn’t wait for a couple of seconds. If I seed 200 entries, I get the error 5 - 6 times.

The issue can be solved by wrapping the code inside fork in a begin rescue block:

		FORK_PIDS << Process.fork do
			begin
				User.create!(username: name, password: 'randomuser')
			rescue Exception
				sleep 0.01
				retry
			end
		end

This works fine, but I am not sure if it’s a good practice…

What’s the way to get around this problem? I know it’s the issue of fork, but I want to keep fork and set timeout for the database. How do I do that?

Ok, so the code:

require 'etc'

ITEMS = 200
NAME = (?a..?y).to_a.permutation
FORK_PIDS = []
PROCESSORS = Etc.nprocessors * 2

time = Time.now
ITEMS.times do |i|
	if FORK_PIDS.length > PROCESSORS
		Process.wait(FORK_PIDS.shift)
		redo
	else
		elap = Time.now - time
		est_rem = ITEMS.*(elap)./(i).-(elap).round(2)
		print "\e[2K#{i}/#{ITEMS} | Elapsed #{elap.round(2)} s | Rem #{est_rem} s | Active #{FORK_PIDS.count}\r"
		name = NAME.next.join.capitalize

		FORK_PIDS << Process.fork do
			begin
				User.create!(username: 'name', password: 'randomuser')
			rescue ActiveRecord::RecordInvalid
				puts $!.full_message
			rescue Exception
				sleep 0.1
				retry
			end
		end
	end
end

FORK_PIDS.each(&Process.method(:wait))

Works absolutely fine, but still, I know the rescue Exception is called often, and it retries…

SQLite isn’t a client-server database, so it doesn’t have the same capabilities for parallel/concurrent processing. When you fire up a whole bunch of forked processes, each with their own connection, they all serialise on the same resources. So in this case, the forks aren’t making it go faster, all the context switching and contention for the same resource is making it slower .

To be honest, I am learning rails, I am noob. But I did what makes sense. I created fork, and rather than seeding on 1 core, I am doing it on 4 cores, it should be theoretically faster.

Let’s be unbiased, sit back, run the code with and without fork, and compare the time.

Non Forked Version:


Forked Version:

The literal time taken is 37 seconds, at most. The time differs 2s in the next result because it involves all cores, 100% of them, so the turbulence is more :wink:


So the fork version takes 37 seconds, the non-fork version takes 80 seconds. So that’s about 2x more time.


So there’s a difference in them. Now consider, my system has 2 core processor, 4 cores with hyperthreading enabled.

While running the forked version, I opened the packet of a snacks.
While running the non-forked version, I finished eating the snacks :wink:

That is to say, for the forked version, the user-time 1m:42s is because it’s actually using all cores for 37 seconds, but it’s not the literal time taken by the program. The real time is what literally taken. If I had no hyperthreading, the time would probably be half of 1m:42s, not sure, if I disable hyperthreading, I will let you know.


So you now know why 2x time difference? Because I have 2 core processor. Hyperthreading does nothing, it just pipelines programs. So if I use core-0 100%, core-1 will do nothing because the main core is already at 100% load. Same goes for core-3, and core-4. Core 3 is under 100% load, so the hyperthreaded core-4 does nothing much. I bet I will get same results by disabling hyperthreading.

If your system has 8 cores (no hyperthreading), you will get 8x speed assuming your storage and memory can handle that load.

Can’t argue with that evidence :slightly_smiling_face:

But the point I was trying to make is that at some stage SQLite3 has to serialise the access to the CREATE USER function so that it can safely and atomically make the update, and I was speculating that this might be the cause of the BusyException you were seeing before.