Timeout::timeout(time) question - or.. What am I missing here?

OK, I have a rake task that runs a bit of code which connects up from
a Rails App to a remote data source over a WAN to replicate down some
data from an Oracle server through the Ruby OCI8 library.

This task is expected to take about 1 second per item replicated. I
do it in 1000 item lots. Occasionally the line hangs, or the
connection drops, or some part of hell gets too cold, and I lose the
link, at which point we get hanging processes waiting for failures.
To handle this I wrap the whole each block in a timeout block set to
2000 seconds.

Each replication is atomic, so as it completes one, it marks it done.
Any that don’t get marked completed get re-replicated. Replicating
the same thing twice isn’t a problem (but I am pretty sure I have
coded around that).
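
For anyone skimming, the "mark it done only after it completes" idea can be sketched roughly as below. This is a minimal in-memory illustration, not the actual Replication model: `Item`, `replicate_pending` and the `copied` array are hypothetical names made up for the example.

```ruby
# Hypothetical stand-in for a replication record (not the real model).
Item = Struct.new(:id, :done)

# Copy every pending item; mark an item done only after its copy succeeds.
def replicate_pending(items, copied)
  items.reject(&:done).each do |item|
    begin
      yield item                                         # the remote copy; may raise
      copied << item.id unless copied.include?(item.id)  # copying twice is harmless
      item.done = true
    rescue StandardError
      # item.done stays false, so the next run picks this item up again
    end
  end
end

items  = [Item.new(1, false), Item.new(2, false), Item.new(3, true)]
copied = []

# First run: the copy of item 2 fails part-way through.
replicate_pending(items, copied) { |item| raise "link dropped" if item.id == 2 }

# Second run: only item 2 is still pending, so only item 2 is retried.
replicate_pending(items, copied) { |item| }
```

Because the done flag is set last, an interrupted run can only leave items unmarked, never marked without having been copied.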

Problem is, it hangs. I have two processes whose lock files carry
time stamps from 16 and 8 hours ago. Far longer than 2000 seconds.
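
A check like the one I did by hand on those lock files could be sketched as follows. It is only an illustration under assumptions: `stale_lock?` is a hypothetical helper, and it uses the file's modification time rather than parsing the time stamp written inside the file.

```ruby
require "tmpdir"

# True when the lock file exists and was last written more than
# max_age_seconds ago. Hypothetical helper, not part of the rake task.
def stale_lock?(path, max_age_seconds, now = Time.now)
  File.exist?(path) && (now - File.mtime(path)) > max_age_seconds
end

fresh, stale = Dir.mktmpdir do |dir|
  lock = File.join(dir, "replication.lock")
  File.open(lock, "w") { |f| f.puts("#{Time.now} - Started replication") }
  fresh_result = stale_lock?(lock, 2000)

  # Pretend the lock was written 8 hours ago.
  eight_hours_ago = Time.now - 8 * 3600
  File.utime(eight_hours_ago, eight_hours_ago, lock)
  [fresh_result, stale_lock?(lock, 2000)]
end

puts "fresh lock stale? #{fresh}, 8 hour old lock stale? #{stale}"
```

This only detects the hang after the fact; it does nothing to stop the hung process itself.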

The rake task looks like:

task :replicate => :environment do
  replicate
end

def replicate
  @lock_file = "#{RAILS_ROOT}/tmp/replication.lock"
  begin
    if File.exist?(@lock_file)
      puts "Lock file exists #{@lock_file}"
    else
      File.open(@lock_file, 'w') { |f| f.puts("#{Time.now} - Started replication") }
      @created_lock_file = true
      Replication.replicate!
    end
  ensure
    File.unlink(@lock_file) if @created_lock_file
  end
end

The replication code looks like this:

def Replication.do_replication
  Replication.timeout(2000, "Replication - Timeout error") do
    self.unreplicated.each do |replication|
      REPLICATION_LOG.info("#{Time.now} - Replicating id #{replication.id}")
      replication.replicate!
      REPLICATION_LOG.flush
    end
  end
end

def Replication.timeout(time, message)
  begin
    Timeout::timeout(time) do
      yield
    end
  rescue Timeout::Error => e
    REPLICATION_LOG.error("#{Time.now} - Timeout error - #{message}\n#{e}")
  end
end
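
The wrapper above puts one deadline around the whole batch. As a point of comparison, here is a small self-contained sketch of the same Timeout pattern applied per item, so that one slow item does not cost the rest of the lot. `with_deadline` is a hypothetical name, and `sleep` stands in for the real replication call.

```ruby
require "timeout"

# Run the block under a deadline; report :ok or :timed_out instead of raising.
def with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
  :ok
rescue Timeout::Error
  :timed_out
end

# Three pretend items: the middle one takes far longer than its deadline.
results = [0.01, 0.5, 0.01].map do |work_seconds|
  with_deadline(0.1) { sleep work_seconds }
end

puts results.inspect
```

Note that this relies on `sleep` being interruptible. Whether a call blocked inside a native extension can be interrupted the same way is not something this sketch demonstrates.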

I can’t see how that is still running after 2000 seconds. Ideas anyone?

Mikel

On Aug 1, 2008, at 9:17 PM, Mikel L. wrote:

I can’t see how that is still running after 2000 seconds. Ideas
anyone?

windows?

a @ http://codeforpeople.com/

On Sat, Aug 2, 2008 at 4:38 PM, ara.t.howard wrote:

On Aug 1, 2008, at 9:17 PM, Mikel L. wrote:

I can’t see how that is still running after 2000 seconds. Ideas anyone?
windows?

Heya Ara,

Nup, on a Linux box as a test bed. Running Ruby 1.8.6 patch level 36
on Ubuntu, kernel 2.6.22-14.

Mikel