[Pvfs2-developers] Trove segmentation fault -

Kevin Harms harms at alcf.anl.gov
Wed May 27 15:48:31 EDT 2009


David,

   I added this to Trac: https://trac.mcs.anl.gov/projects/pvfs/ticket/106

kevin

On May 26, 2009, at 3:14 PM, David Bonnie wrote:

> Hey everyone -
>
> Nick and I have been digging into trove and have found a bug.  When
> trove debugging is enabled (via the "trove" flag in the config file),
> the server crashes during I/O calls (namely pvfs2-cp).  It sometimes
> runs for a few seconds before crashing, but it segfaults every time I
> try to transfer a 256 MB file onto or off of the server.  I tested
> release 2.8.1 on both 32-bit RHEL5 and 64-bit Fedora 10.  Only one
> server was running, and it was acting as both a metadata and an I/O
> server.
>
> I believe it has something to do with threading, since it happens
> when printing out a status message (I'm fairly certain the call to
> gossip_debug() on line 327 of dbpf-bstream.c is the culprit).  Here is
> the last bit of the log file and the stack trace from gdb on 32-bit
> RHEL5:
>
>
> [D 05/26 13:39] aio_progress_notification: BSTREAM_READ_LIST complete:
> aio_return() says 262144 [fd = 11]
> [D 05/26 13:39] *** starting delayed ops if any (state is  
> LIST_PROC_ALLPOSTED)
> [D 05/26 13:39] DBPF I/O ops in progress: 1
> [New Thread 0xb56a0b90 (LWP 1272)]
> [Thread 0xb2cfeb90 (LWP 1271) exited]
> [D 05/26 13:39] issue_or_delay_io_operation: lio_listio posted
> 0xa0d0ec8 (handle 9223372036854775805, ret 0)
> [D 05/26 13:39]  --- aio_progress_notification called with handle
> 9223372036854775805 (0xa0d0ec8)
> [D 05/26 13:39] aio_progress_notification: BSTREAM_READ_LIST complete:
> aio_return() says 262144 [fd = 11]
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xb56a0b90 (LWP 1272)]
> 0x00c04993 in strlen () from /lib/libc.so.6
> (gdb) bt
> #0  0x00c04993 in strlen () from /lib/libc.so.6
> #1  0x00bd4bce in vfprintf () from /lib/libc.so.6
> #2  0x00bf53b4 in vsnprintf () from /lib/libc.so.6
> #3  0x08059bed in gossip_debug_fp_va (fp=0xb569fb5c,
>    prefix=<value optimized out>,
>    format=0xb569fc80 "*** starting delayed ops if any (state is ST
> complete: aio_return() says 262144 [fd = 11]\n", ap=0xb56a00d0 "t:
> hpz\016\b", ts=13455348)
>    at src/common/gossip/gossip.c:506
> #4  0x0805a041 in __gossip_debug (mask=65536, prefix=63 '?',
>    format=0x80dc3b0 "*** starting delayed ops if any (state is %s)\n")
>    at src/common/gossip/gossip.c:281
> #5  0x080a9ed9 in aio_progress_notification (sig=
>      {sival_int = 168627912, sival_ptr = 0xa0d0ec8})
>    at src/io/trove/trove-dbpf/dbpf-bstream.c:237
> #6  0x080ba89c in alt_lio_thread (foo=0xa0d0ce8)
>    at src/io/trove/trove-dbpf/dbpf-alt-aio.c:275
> #7  0x00d0f49b in start_thread () from /lib/libpthread.so.0
> #8  0x00c6642e in clone () from /lib/libc.so.6
>
>
>
> Thanks,
> - Dave
>
> _______________________________________________
> Pvfs2-developers mailing list
> Pvfs2-developers at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
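
The threading hypothesis in the quoted message can be illustrated with a
small sketch. In frame #3 of the backtrace, the format string appears to
hold fragments of two different log messages, which is what one might see
if concurrent callers shared or overran a formatting buffer. The code
below is NOT the PVFS2 gossip implementation: safe_debug(), run_demo(),
LOG_BUF_SIZE and MAX_THREADS are hypothetical names chosen for
illustration. It only shows one common pattern, assuming the problem is a
data race: format into a per-call buffer with vsnprintf(), then write the
finished line under a mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

#define LOG_BUF_SIZE 512  /* hypothetical size, not taken from gossip.c */
#define MAX_THREADS  16   /* hypothetical cap for this demo */

static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for a debug-print routine (not the gossip API).
 * Each call formats into its own stack buffer, so no formatting state is
 * shared between threads; the mutex serializes only the final write. */
static int safe_debug(FILE *fp, const char *format, ...)
{
    char buf[LOG_BUF_SIZE];
    va_list ap;
    int n;

    assert(fp != NULL && format != NULL);
    va_start(ap, format);
    n = vsnprintf(buf, sizeof(buf), format, ap); /* truncates, never overruns */
    va_end(ap);
    if (n < 0)
        return -1;

    pthread_mutex_lock(&log_lock);
    fputs(buf, fp);
    pthread_mutex_unlock(&log_lock);
    return n;
}

struct worker_arg {
    FILE *fp;
    int id;
    int iterations;
};

/* Thread body: log one status line per iteration. */
static void *worker(void *p)
{
    struct worker_arg *arg = p;
    int i;

    for (i = 0; i < arg->iterations; i++)
        safe_debug(arg->fp, "worker %d: state is %s (op %d)\n",
                   arg->id, "LIST_PROC_ALLPOSTED", i);
    return NULL;
}

/* Run nthreads concurrent loggers against a temporary file, then read the
 * file back and count the lines that parse as complete messages.
 * Returns 0 if the arguments are out of range or the file cannot be made. */
unsigned long run_demo(int nthreads, int iterations)
{
    pthread_t tids[MAX_THREADS];
    struct worker_arg args[MAX_THREADS];
    char line[LOG_BUF_SIZE];
    unsigned long intact = 0;
    int started = 0, id, op, i;
    FILE *fp;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return 0;
    fp = tmpfile();
    if (fp == NULL)
        return 0;

    for (i = 0; i < nthreads; i++) {
        args[i].fp = fp;
        args[i].id = i;
        args[i].iterations = iterations;
        if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    rewind(fp);
    while (fgets(line, sizeof(line), fp) != NULL)
        if (sscanf(line, "worker %d: state is LIST_PROC_ALLPOSTED (op %d)",
                   &id, &op) == 2)
            intact++;
    fclose(fp);
    return intact;
}
```

Calling run_demo(4, 1000) from a small main() (built with -pthread) should
return 4000 when every line arrives intact. Whether the real crash is a
race like this, or something else such as stack corruption in the AIO
thread, would need to be confirmed against the gossip and alt-aio code.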


