Comment 6 for bug 123979

In , Twiest (twiest) wrote :

---- Reported by <email address hidden> 2006-09-20 22:16:57 MST ----

Description of Problem: We've seen several cases where our C# program has
finished the last line of main() but we never get back to the Linux prompt,
because mono appears to be hung. While it's in this state, we can't run any
other mono programs until the hanging one is killed. Is it possible that
the mono runtime takes a lock somewhere that prevents other instances from
running, and that one runtime instance is holding that lock forever?

Steps to reproduce the problem:
We haven't yet been able to reproduce this consistently, but we've seen it
several times. Killing the process that is hung frees everything up.

We'll keep trying, but thought it was worth entering the bug. Perhaps
there is something we can do to troubleshoot it when it gets in this state.

Actual Results:

Expected Results:

How often does this happen? every once in a while

Additional Information:

---- Additional Comments From <email address hidden> 2006-09-21 11:03:43 MST ----

This looks like an issue with the io-layer.
When it happens again, please attach gdb to the hanging mono process
and get a backtrace of all the threads (type 'thread apply all bt' at
the gdb prompt). Also, what precise version of mono are you using
(mono --version)?
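
For reference, the same backtrace can be captured without an interactive gdb session, which also avoids the "Type <return> to continue" pager prompts. This is only a sketch under assumptions: it needs a gdb recent enough to support the -batch and -ex options, the helper name dump_threads and the default output file name are made up for illustration, and the PID of the hung mono process has to be looked up separately (for example with ps).

```shell
#!/bin/sh
# Hypothetical helper (not part of mono or gdb): attach gdb to a running
# process, dump a backtrace of every thread into a file, then detach.
# Set GDB to use a different gdb binary; it defaults to "gdb".
dump_threads() {
    pid="$1"
    # Default output file name is an arbitrary choice for this sketch.
    out="${2:-mono-$pid-backtrace.txt}"
    "${GDB:-gdb}" -p "$pid" -batch \
        -ex "set pagination off" \
        -ex "thread apply all bt" > "$out" 2>&1
}
```

For example, "dump_threads 14555" would write mono-14555-backtrace.txt, which can then be attached to the bug. Attaching normally requires running as the same user as the target process, or as root.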

---- Additional Comments From <email address hidden> 2006-09-21 13:06:37 MST ----

Will do (the gdb bit). We're using 1.1.13.8.

---- Additional Comments From <email address hidden> 2006-09-26 13:21:41 MST ----

I encountered the hang again. There are 2 mono processes. The first
yields:

(gdb) thread apply all bt

(gdb) bt

#0 0x0038a410 in ?? ()

#1 0xbfb79858 in ?? ()

#2 0x00000000 in ?? ()

The 2nd process seems to have more info:

(gdb) thread apply all bt

Thread 5 (Thread -1213420640 (LWP 14556)):

#0 0x0038a410 in ?? ()

#1 0xb7aca388 in ?? ()

#2 0x00000001 in ?? ()

#3 0x00ad8011 in ?? ()

#4 0x0047236d in semop () from /lib/libc.so.6

#5 0x080fe496 in _wapi_shm_sem_lock (sem=-4) at shared.c:483

#6 0x081042de in _wapi_handle_update_refs ()

    at ../../mono/io-layer/handles-private.h:319

#7 0x080fe5f5 in collection_thread (unused=0x0) at collection.c:37

#8 0x007b040b in start_thread () from /lib/libpthread.so.0

#9 0x00470b7e in clone () from /lib/libc.so.6

Thread 4 (Thread -1218552928 (LWP 14557)):

#0 0x0038a410 in ?? ()

#1 0xb75e5270 in ?? ()

#2 0x0000866b in ?? ()

#3 0x00000000 in ?? ()

Thread 3 (Thread -1222575200 (LWP 14559)):

#0 0x0038a410 in ?? ()

#1 0xb720ef94 in ?? ()

#2 0x00008b93 in ?? ()

#3 0x00000000 in ?? ()

Thread 2 (Thread -1225593952 (LWP 14560)):

#0 0x0038a410 in ?? ()

#1 0xb6f2e458 in ?? ()

#2 0xb6f2e338 in ?? ()

#3 0xb6f2e3b8 in ?? ()

#4 0x00469e11 in ___newselect_nocancel () from /lib/libc.so.6

#5 0xb6fca3cc in Tcl_InitNotifier () from /usr/lib/libtcl8.4.so

#6 0x007b040b in start_thread () from /lib/libpthread.so.0

#7 0x00470b7e in clone () from /lib/libc.so.6

Thread 1 (Thread -1208580416 (LWP 14555)):

#0 0x0038a410 in ?? ()

#1 0xbff1ab40 in ?? ()

#2 0x081be970 in __JCR_LIST__ ()

#3 0xbff1ab18 in ?? ()

#4 0x007b60d8 in recvfrom () from /lib/libpthread.so.0

#5 0x080f72de in _wapi_recvfrom (fd=9, buf=0x8458560, len=1024, recv_flags=0,
    from=0x0, fromlen=0x0) at sockets.c:498

#6 0x080f7234 in _wapi_recv (fd=4294966784, buf=0xfffffe00, len=4294966784,
    recv_flags=-512) at sockets.c:478

#7 0x080baa47 in ves_icall_System_Net_Sockets_Socket_Receive_internal (
    sock=4294966784, buffer=0xbff1ab18, offset=136046960, count=1024,
    flags=-512, error=0xbff1ac00) at socket-io.c:1212

#8 0xb721eedb in ?? ()

#9 0x00000009 in ?? ()

#10 0x08458550 in ?? ()

#11 0x00000000 in ?? ()

(gdb)

---- Additional Comments From <email address hidden> 2006-10-06 14:00:20 MST ----

Created an attachment (id=170526)
c# code

---- Additional Comments From <email address hidden> 2006-10-06 14:00:46 MST ----

Created an attachment (id=170527)
Makefile

---- Additional Comments From <email address hidden> 2006-10-06 14:01:31 MST ----

This seems to happen if we exit main() while other threads are still
actively running and using AutoResetEvents, ManualResetEvents, etc.

---- Additional Comments From <email address hidden> 2006-11-13 18:07:41 MST ----

I've been unable to reproduce this with current svn. I've tested on
several machines, to see whether variations in CPU speed would expose a
race condition.

On which version of mono exactly are you seeing the bug?

If it happens again, running "mono --wapi=seminfo" should show the PID
of the mono process that is holding the semaphore locked. Getting a
backtrace of that process should show what's causing it. (If your
mono installation is too old to have that option, it's also older than
several fixed bugs of this variety.)

---- Additional Comments From <email address hidden> 2007-01-01 16:55:39 MST ----

Setting bug to NEEDINFO.

Imported an attachment (id=170526)
Imported an attachment (id=170527)

Operating system details (cf_op_sys_details): Red Hat Enterprise Linux WS release 4