Hello,
I’m about to release Parallel::ForkManager 1.1, and while writing
new examples (using Net::HTTP) for the PFM 1.1 features, I sometimes
(maybe 10-15% of the time) hit the following error when I try to
connect to a URL whose host doesn’t exist or isn’t reachable:
/usr/lib/ruby/1.8/timeout.rb:60:in `rbuf_fill': execution expired (Timeout::Error)
    from /usr/lib/ruby/1.8/net/protocol.rb:134:in `rbuf_fill'
    from /usr/lib/ruby/1.8/net/protocol.rb:86:in `read'
    from /usr/lib/ruby/1.8/net/http.rb:2212:in `read_body_0'
    from /usr/lib/ruby/1.8/net/http.rb:2173:in `read_body'
    from /usr/lib/ruby/1.8/net/http.rb:773:in `get'
    from /usr/lib/ruby/1.8/net/http.rb:1053:in `request'
    from /usr/lib/ruby/1.8/net/http.rb:2136:in `reading_body'
    from /usr/lib/ruby/1.8/net/http.rb:1052:in `request'
    from /usr/lib/ruby/1.8/net/http.rb:1037:in `request'
    from /usr/lib/ruby/1.8/net/http.rb:543:in `start'
    from /usr/lib/ruby/1.8/net/http.rb:1035:in `request'
    from /usr/lib/ruby/1.8/net/http.rb:772:in `get'
    from ./parallel_http_get2.rb:36
    from ./lib/parallel/forkmanager.rb:232:in `call'
    from ./lib/parallel/forkmanager.rb:232:in `start'
    from ./lib/parallel/forkmanager.rb:232:in `fork'
    from ./lib/parallel/forkmanager.rb:232:in `start'
    from ./parallel_http_get2.rb:30
    from ./parallel_http_get2.rb:26:in `each'
    from ./parallel_http_get2.rb:26
From the error message, this looks like a simple case where I should
rescue ‘Exception’, since the Timeout::Error is ending up on stderr,
but when I try to rescue ‘Exception’ in my test program, nothing
happens! The rescue clause never runs!
Here’s my test code:
#!/usr/bin/env ruby
require 'net/http'
require 'lib/parallel/forkmanager'

save_dir = '/tmp'
my_urls = [
  'http://www.cnn.com/index.html',
  "O'Reilly Media - Technology and Business Training",
  'Cakewalk',
  'http://www.asdfsemicolonl.kj/index.htm'
]
my_timeout = 5 # seconds
max_proc = 20

pfm = Parallel::ForkManager.new(max_proc)
pfm.run_on_finish(
  lambda { |pid, exit_code, ident|
    print "** PID (#{pid}) for #{ident} exited with code #{exit_code}!\n"
  }
)
my_urls.each { |my_url|
  begin
    pfm.start(my_url) {
      url = URI.parse(my_url)
      out_file = save_dir + '/' + url.host + '.txt'
      http = Net::HTTP.new(url.host, url.port)
      http.open_timeout = http.read_timeout = my_timeout
      res = http.get(url.path)
      status = res.code
      if status.to_i == 200
        f = File.open(out_file, 'w')
        f.print res.body
        f.close()
        # start() with a block means that we exit with status,
        # or else it's 0 all the time.
        exit 0
      else
        exit 255
      end
    } # end pfm.start { … }
  rescue Exception => e
    print "Arggh, exception: ", e, "\n"
    exit 255
  end
}
pfm.wait_all_children()
Would somebody be able to shed some light on why I’m unable to handle
the exceptions that Net::HTTP is throwing? Am I just trying to catch
the wrong exception or is there something else? Is it because I’m
trying to handle the exception in the child?
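
To rule out Parallel::ForkManager and Net::HTTP themselves, the
situation can be reduced to a bare Kernel#fork. This is only a minimal
sketch, assuming that pfm.start hands its block to fork (which is what
the `fork' frame at forkmanager.rb:232 in the trace suggests). The
variable names are made up for illustration, and Timeout::Error is
raised by hand rather than by a real request. It compares a rescue
wrapped around the fork call with a rescue placed inside the forked
block:

```ruby
require 'timeout'

# Case 1: rescue wrapped around the fork call (the layout of the test
# script). The exception is raised inside the child's block and does not
# reach this rescue; the child reports it on stderr and exits with a
# failure status.
begin
  pid = fork do
    raise Timeout::Error, 'execution expired'
  end
  Process.wait(pid)
  outer_status = $?.exitstatus
rescue Exception => e
  print "Arggh, exception: ", e, "\n"
end

# Case 2: rescue inside the forked block, so the child handles the error
# itself and chooses its own exit status.
pid = fork do
  begin
    raise Timeout::Error, 'execution expired'
  rescue Timeout::Error => e
    exit 255
  end
end
Process.wait(pid)
inner_status = $?.exitstatus

print "outer rescue: child exited with #{outer_status}\n"
print "inner rescue: child exited with #{inner_status}\n"
```

If the same thing is happening in my script, moving the begin/rescue
inside the block passed to pfm.start would presumably let each child
report the timeout through its exit code instead of dying with a
backtrace.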
Note that this is: ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
under Ubuntu 9.10 (Karmic).