[Pvfs2-users] Crashing pvfs2-server
Robert Latham
robl at mcs.anl.gov
Mon Feb 18 17:33:19 EST 2008
On Thu, Feb 14, 2008 at 04:33:29PM -0500, Jon M Burgoyne wrote:
> Wondering if anyone has run across this yet (I'm new to the list). Am running
> pvfs2 server v 2.7.0 on 3 meta/io servers which are RHEL5
> 2.6.18-53.1.6.el5. These are dual/dual opteron machines with 8G ram per.
> The clients are running the same version, only built on RHEL4 kernel. The
> filesystem is very large (3 x 8TB ext3s) with a fs.conf file:
Hi
This configuration *should* work. About a month ago we had a round of
emails diagnosing some problems with RHEL 4, but that was a different
problem -- the kernel module would not get built correctly.
> I have a user that consistently crashes one of the servers (seems random).
> After enabling segv-backtrace, I get the following message:
>
> [D 02/14 15:48] PVFS2 Server version 2.7.0 starting.
> [E 02/14 15:55] PVFS2 server: signal 11, faulty address is 0x18, from 0x3cb366eef3
> [E 02/14 15:55] [bt] /lib64/libc.so.6 [0x3cb366eef3]
> [E 02/14 15:55] [bt] /lib64/libc.so.6 [0x3cb366eef3]
> [E 02/14 15:55] [bt] /lib64/libc.so.6(cfree+0x8c) [0x3cb3672b1c]
> [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server(job_testcontext+0x13b) [0x432ecb]
> [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server(main+0xdc8) [0x4109f8]
> [E 02/14 15:55] [bt] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3cb361d8a4]
> [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server [0x40e369]
>
> Does anyone have a clue about this one?
This backtrace is interesting, but without debugging information it's
hard to say what's going on here.
Can you rebuild the pvfs servers with debugging information? (Add
'-g' to CFLAGS, re-run configure, and rebuild.) A signal 11 should
also be easy to diagnose with valgrind.
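
In case it helps, here is a rough sketch of how the server-side frames
from a backtrace like the one above could be pulled out and handed to
addr2line (from binutils) once the binary has debugging information.
The helper names and the regular expression are my own assumptions
about the log format, based only on the lines quoted above, and the
binary path is simply the one that appears in the backtrace:

```python
import re

# Assumed frame format, taken from the quoted log, e.g.:
# [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server(job_testcontext+0x13b) [0x432ecb]
FRAME_RE = re.compile(
    r"\[bt\]\s+(?P<module>\S+?)(?:\((?P<symbol>[^)]*)\))?\s+"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]"
)


def server_frames(log_text, binary="/usr/sbin/pvfs2-server"):
    """Return (symbol, address) pairs for frames that belong to `binary`.

    `symbol` is None when the log line carries no "(name+offset)" part.
    """
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group("module") == binary:
            frames.append((match.group("symbol"), match.group("addr")))
    return frames


def addr2line_command(frames, binary="/usr/sbin/pvfs2-server"):
    """Build an addr2line command line; -f prints function names,
    -e names the executable to read debugging information from."""
    addresses = " ".join(addr for _symbol, addr in frames)
    return f"addr2line -f -e {binary} {addresses}"


if __name__ == "__main__":
    sample = (
        "> [E 02/14 15:55] [bt] /lib64/libc.so.6(cfree+0x8c) [0x3cb3672b1c]\n"
        "> [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server(job_testcontext+0x13b) [0x432ecb]\n"
        "> [E 02/14 15:55] [bt] /usr/sbin/pvfs2-server(main+0xdc8) [0x4109f8]\n"
    )
    print(addr2line_command(server_frames(sample)))
```

The libc frames are skipped on purpose: their addresses are relative to
wherever the shared library was loaded, so they cannot be passed to
addr2line as they stand. Note also that the addresses only line up if
they come from a crash of the rebuilt binary, not from the old log.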
Thanks
==rob
--
Rob Latham
Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA B29D F333 664A 4280 315B