In a program with two DRb servers running (start_service is called
twice), I get the following deadlock after running for a while with a
client connected to both servers:
How can I debug this issue? I don’t understand why it is a deadlock at
all, since drb.rb:944 is a call to Socket#accept, which does not
depend purely on other Ruby threads.
Any ideas?
For a deadlock you need at least two resources that are locked in
different order. Maybe you have synchronized calls across the two
servers that deadlock.
You could use set_trace_func to trace program execution until the
deadlock and look at the execution flow.
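A minimal sketch of the set_trace_func suggestion, assuming you can wrap the suspect part of the program in a block. The helper name trace_calls and the log format are made up for illustration. It logs both Ruby-level calls and C-level calls (such as Socket#accept), together with the calling thread:

```ruby
# Sketch only: `trace_calls` is a hypothetical helper, not part of DRb.
# Logs every method call (Ruby and C level) with the thread that made it.
def trace_calls(io = $stderr)
  set_trace_func proc { |event, file, line, id, _binding, classname|
    next unless event == 'call' || event == 'c-call'
    io.puts "#{Thread.current.inspect} #{classname}##{id} (#{file}:#{line})"
  }
  yield
ensure
  set_trace_func(nil) # always switch tracing off again
end

# Usage: trace_calls { code_that_eventually_hangs }
```

The trace gets very verbose and slows the program down a lot, so it is probably best to log to a file and read only the tail after the hang.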
In a program with two DRb servers running (two time start_service), i
Why do you have two servers?
Well… legacy. I have converted my application to having only 1 DRb
service started, but the same problem occurs. I still get a deadlock
after the clients have been connecting for a while.
Too bad.
For a deadlock you need at least two resources that are locked in
different order. Maybe you have synchronized calls across the two
servers that deadlock.
My main thread is blocked by DRb.thread.join. All other threads are
inside the DRb library on either Socket#accept, #read or #write.
And, are there any locks held?
How can there be a deadlock if a thread is waiting in a Socket#accept
call? As I understand it, Ruby's deadlock detection simply fires when
there is no thread left to run.
You could use set_trace_func to trace program execution until the
deadlock and look at the execution flow.
I have tried this, but it doesn’t show anything other than the
deadlock report from Ruby, i.e. that the threads are calling
Socket#accept, #read or #write and Thread#join.
These issues are next to impossible to debug without access to code
and an understanding of what the app really does. I’m afraid, I can’t
help you further right now.
Kind regards
robert
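To see what each thread is doing at a given moment (for example, whether they really are all parked in Socket#accept, #read, #write or Thread#join), something like the following sketch may help. The helper name dump_threads is made up, and Thread#backtrace is only available on Ruby 1.9 and later, so on 1.8 this prints the status line only:

```ruby
# Sketch only: `dump_threads` is a hypothetical helper.
# Prints the status of every live thread, plus its backtrace where supported.
def dump_threads(io = $stderr)
  Thread.list.each do |t|
    io.puts "#{t.inspect} status=#{t.status.inspect}"
    next unless t.respond_to?(:backtrace) # not available on Ruby 1.8
    (t.backtrace || []).each { |frame| io.puts "    #{frame}" }
  end
end
```

Calling it periodically from a separate logging thread gives a series of snapshots leading up to the hang.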
What version and patch level of ruby do you have? If you have ruby
1.8.6 and the patch level is less than p111 then you have a faulty
ruby interpreter with broken threading that can cause these deadlocks.
Make sure you are using ruby 1.8.5 or ruby 1.8.6p111 minimum.
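Besides running ruby -v, the version and patch level can be checked from inside the program. A small sketch, where suspect_threading? is a made-up helper name that encodes only the rule stated above (1.8.6 with a patch level below 111):

```ruby
# Sketch only: `suspect_threading?` is a hypothetical helper that flags
# the interpreter range described above (1.8.6 below patch level 111).
def suspect_threading?(version = RUBY_VERSION, patchlevel = RUBY_PATCHLEVEL)
  version == '1.8.6' && patchlevel < 111
end

puts "ruby #{RUBY_VERSION} patchlevel #{RUBY_PATCHLEVEL}"
puts 'this interpreter is in the suspect range' if suspect_threading?
```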
I had the same problem with 1.8.6p111. I finally tracked it down to a
bug in Process.create from the ‘win32-process’ gem. Some code added to
this function after version 0.5.5 would call CloseHandle on something
that was not a handle but a process or thread ID. When these happened
to have the same value as socket handles, etc., the process would
sometimes deadlock, sometimes simply close a listening socket, fail in
Socket#accept, or go into infinite loops.