Segfaults with FFI in a multi-threaded environment

We have a somewhat unusual setup that I’m trying to debug. So far the
problem looks like it is on the JRuby/Java side of things.

I’m using FFI to load Tamarin
(http://www.mozilla.org/projects/tamarin/) as a shared library and then
running the Tamarin VM, which starts several threads that load SWF
files. The two VMs exchange messages over ZeroMQ.

When I start multiple threads on the JRuby side and have each thread
send and receive nonstop, it segfaults within 20-30 seconds.
With just one thread it’s fine.

The best backtrace I’ve been able to get is below. There are a lot of
unresolved symbols in it, but it might help show where to look next.
Tamarin, OpenJDK, and libzmq are built with debugging symbols but
libjffi is not; I suspect a copy of libjffi with symbols would show
more.

Chris


Program received signal SIGABRT, Aborted.
0x00007fc92c730a75 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00007fc92c730a75 in raise () from /lib/libc.so.6
#1 0x00007fc92c7345c0 in abort () from /lib/libc.so.6
#2 0x00007fc92c190a59 in os::abort (dump_core=true) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:1495
#3 0x00007fc92c2d19c1 in VMError::report_and_die
(this=0x7fc913ff4bd0) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:898
#4 0x00007fc92beddc96 in report_vm_error (file=0x7fc92c32b178
“/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/parallelScavengeHeap.cpp”,
line=809,
error_msg=0x7fc92c307f8f “Unimplemented()”, detail_msg=0x0) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/utilities/debug.cpp:176
#5 0x00007fc92c1a8283 in ParallelScavengeHeap::block_start
(this=, addr=0xfae6d1a8)
at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/gc_implementation/parallelScavenge/parallelScavengeHeap.cpp:809
#6 0x00007fc92c189db0 in os::print_location (st=0x7fc913ff4e20,
x=4209430952, print_pc=false) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/runtime/os.cpp:804
#7 0x00007fc92c194319 in os::print_context (st=0x7fc913ff4e20,
context=0x7fc913ff5000) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:796
#8 0x00007fc92c2d0b2c in VMError::report (this=0x7fc913ff4f60,
st=0x7fc913ff4e20) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:461
#9 0x00007fc92c2d160b in VMError::report_and_die
(this=0x7fc913ff4f60) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:843
#10 0x00007fc92c194703 in JVM_handle_linux_signal (sig=11,
info=0x7fc913ff5130, ucVoid=0x7fc913ff5000,
abort_if_unrecognized=)
at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:494
#11
#12 0x00007fc92c02d80a in os::write_memory_serialize_page (env=, handle=0x7fc913ff54b0) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/runtime/os.hpp:301
#13 InterfaceSupport::serialize_memory (env=,
handle=0x7fc913ff54b0) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/os/linux/vm/interfaceSupport_linux.hpp:28
#14 ThreadStateTransition::transition_from_native (env=, handle=0x7fc913ff54b0) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/runtime/interfaceSupport.hpp:181
#15 ThreadStateTransition::trans_from_native (env=, handle=0x7fc913ff54b0) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/runtime/interfaceSupport.hpp:200
#16 ThreadInVMfromNative (env=,
handle=0x7fc913ff54b0) at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/runtime/interfaceSupport.hpp:247
#17 JVM_IHashCode (env=, handle=0x7fc913ff54b0)
at
/build/buildd/openjdk-6-6b20-1.9.7/build/openjdk/hotspot/src/share/vm/prims/jvm.cpp:455
#18 0x00007fc9275b9fdf in ?? ()
#19 0x00000000fae6d080 in ?? ()
#20 0x0000000800000016 in ?? ()
#21 0x00000000fae699c8 in ?? ()
#22 0x00000000e15ed840 in ?? ()
#23 0xe11efd00e1597808 in ?? ()
#24 0x00000000fae6af80 in ?? ()
#25 0x00000000d66343a8 in ?? ()
#26 0x00000000fae6b0c0 in ?? ()
#27 0x00000000e15ed480 in ?? ()
#28 0x00007fc9275d5820 in ?? ()
#29 0x00000000fae6d080 in ?? ()
#30 0xe1210b68e138d578 in ?? ()
#31 0x00000000ffffffff in ?? ()
#32 0x00000000fae6d0b0 in ?? ()
#33 0x00000000fae6d100 in ?? ()
#34 0x00000000fae6d180 in ?? ()
#35 0x00000000fae6d1a8 in ?? ()
#36 0x00000000e15ed480 in ?? ()
#37 0x00000000fae6d080 in ?? ()
#38 0x00000000e138d578 in ?? ()
#39 0x00000000fae6b068 in ?? ()
#40 0x00000000fae6afc8 in ?? ()
#41 0x00000000e138d578 in ?? ()
#42 0x00000000e138d578 in ?? ()
#43 0x00000000fae6b0a0 in ?? ()
#44 0x00000000fae6c370 in ?? ()
#45 0x00000000e15ed480 in ?? ()
#46 0x00000000e1597808 in ?? ()
#47 0x00000000fae6d0b0 in ?? ()
#48 0x00007fc9276df950 in ?? ()
#49 0x00000000e15ed480 in ?? ()
#50 0x00000000fae6d080 in ?? ()
#51 0x00000000e1437a38 in ?? ()
#52 0x00007fc92763eb38 in ?? ()
#53 0x00000000e15ed480 in ?? ()
#54 0x00000006d79aaee8 in ?? ()
#55 0x00000000e1596a48 in ?? ()
#56 0x00007fc9276f25b0 in ?? ()
#57 0x00000000e1596748 in ?? ()
#58 0x00000000e15ed480 in ?? ()
#59 0x00000000fae6af58 in ?? ()
#60 0x00000000e15ed480 in ?? ()
#61 0x00000000fae6af80 in ?? ()
#62 0x00000000fae6b068 in ?? ()
#63 0x00000000fae6d080 in ?? ()
#64 0x00000000fae64cf8 in ?? ()
#65 0x00000000e1597808 in ?? ()
#66 0x00000000fae6afc8 in ?? ()
#67 0xe138d578e138d578 in ?? ()
#68 0x00000000fae6adf8 in ?? ()
#69 0xe1596a48e15969f8 in ?? ()
#70 0x00000000fae6af08 in ?? ()
#71 0x0000000000000007 in ?? ()
#72 0x00007fc9275e2950 in ?? ()
#73 0x00000000e15ed480 in ?? ()
#74 0x00007fc9276dcedc in ?? ()
#75 0x00000000e1437a38 in ?? ()
#76 0x00007fc927708ddc in ?? ()
#77 0x00000000e1596748 in ?? ()
#78 0x00000000e15ed480 in ?? ()
#79 0x00000000e1597808 in ?? ()
#80 0x00000000e12115e0 in ?? ()
#81 0xe1393208e138d578 in ?? ()
#82 0x00000000e1593ff0 in ?? ()
#83 0x00000000fae64cf8 in ?? ()
#84 0x0000000000000004 in ?? ()
#85 0x00000000e11f0d40 in ?? ()
#86 0x00007fc927662660 in ?? ()
#87 0x00000000e15062a8 in ?? ()
#88 0x00000000fae668e8 in ?? ()
#89 0x00000000fae67168 in ?? ()
#90 0x00000000e1596700 in ?? ()
#91 0x00000000e1597808 in ?? ()
#92 0x00000000e17fc390 in ?? ()
#93 0x00000000e1437a38 in ?? ()
#94 0x00000000e15ed480 in ?? ()
#95 0x02e60ea6e138d578 in ?? ()
#96 0x00000000e138d578 in ?? ()
#97 0x00000000fae66e90 in ?? ()
#98 0x00007fc9276dd9e4 in ?? ()
#99 0x00000000e15ed480 in ?? ()
#100 0x00007fc927705e78 in ?? ()
#101 0xe1437a38e17fc3b0 in ?? ()
#102 0x00000000e17fc378 in ?? ()
#103 0x00000000fae66e78 in ?? ()
#104 0x00000000e1597808 in ?? ()
#105 0x00000000fae67168 in ?? ()
#106 0x00000000e1154ad0 in ?? ()
#107 0x00000000fae6add0 in ?? ()
#108 0x00000000fae66e90 in ?? ()
#109 0x00000000e17fc5b0 in ?? ()
#110 0x00000000fae671e8 in ?? ()
#111 0x00000000fae66e58 in ?? ()
#112 0x00007fc925646b0a in
Java_com_kenai_jffi_Foreign_invokeArrayReturnInt () from
/marvel/jruby-1.6.1/lib/native/x86_64-Linux/libjffi-1.0.so
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

On Jun 3, 2011, at 2:17 PM, snacktime wrote:

> Just one thread and it’s fine.
>
> The best backtrace I’ve been able to get is below. There are a lot of
> unresolved symbols there, but maybe it might help show where to look
> next? Tamarin, openjdk, and libzmq are built with debugging symbols
> but libjffi is not, I suspect if I got a copy of libjffi with symbols
> it might show more.

Make sure that you are only using the 0mq socket(s) from the thread in
which they were created. If you are using one socket from multiple
threads, you need to protect it with a memory barrier (e.g. mutex).
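
The two options above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `FakeSocket` and `LockedSocket` are hypothetical stand-ins invented for this example, not the ffi-rzmq or 0mq API, and a real socket would be created from a 0mq context instead. The point is the ownership pattern: either each thread creates and uses its own socket, or every call on a shared socket goes through one mutex.

```ruby
# Hypothetical stand-in for a 0mq socket (NOT the real ffi-rzmq API).
# It just records what was "sent" so the pattern can be run anywhere.
class FakeSocket
  attr_reader :sent

  def initialize
    @sent = []
  end

  def send_string(msg)
    @sent << msg
  end
end

# Hypothetical wrapper: only one thread at a time may call into the socket.
class LockedSocket
  def initialize(socket)
    @socket = socket
    @lock = Mutex.new
  end

  def send_string(msg)
    @lock.synchronize { @socket.send_string(msg) }
  end
end

# Option 1: one socket per thread, created inside the thread that uses it.
def run_per_thread(thread_count, messages_each)
  threads = Array.new(thread_count) do |i|
    Thread.new do
      socket = FakeSocket.new
      messages_each.times { |n| socket.send_string("t#{i}-#{n}") }
      socket.sent.size # returned through Thread#value
    end
  end
  threads.map(&:value)
end

# Option 2: one shared socket, with every call serialized by the mutex.
def run_shared(thread_count, messages_each)
  raw = FakeSocket.new
  shared = LockedSocket.new(raw)
  threads = Array.new(thread_count) do |i|
    Thread.new { messages_each.times { |n| shared.send_string("t#{i}-#{n}") } }
  end
  threads.each(&:join)
  raw.sent.size
end
```

Option 1 is usually the simpler choice because nothing is shared. On JRuby, threads run in parallel, so the lock in option 2 is what keeps concurrent calls from overlapping.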

cr

On Fri, Jun 3, 2011 at 12:49 PM, Chuck R. wrote:

> Make sure that you are only using the 0mq socket(s) from the thread in which
> they were created. If you are using one socket from multiple threads, you need to
> protect it with a memory barrier (e.g. mutex).

Yeah, I’ve looked at the code repeatedly to see if I might be doing that
without realizing it. But if I were, I suspect it would blow up in a
very different way, and something in the backtrace would be pointing at
ZeroMQ, which doesn’t appear to be the case.

Chris

Does anyone know how to get the source for libjffi? The JRuby source
seems to ship with a precompiled version.

Chris

I figured out how to compile the JRuby C extensions with debugging
symbols. After that it was fairly easy to solve: the bug was in my own
native C code.

Chris