[Pvfs2-developers] pvfs2-server failures openib

Sam Lang slang at mcs.anl.gov
Wed Apr 4 12:32:35 EDT 2007


Right now that backtrace isn't very useful, since most frames are bare
addresses. I think the backtrace on segfault will print source files and
lines if you compile with debugging: configure with --enable-strict or
build with CFLAGS=-g.
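
As a rough sketch of how the existing addresses could be resolved after a debug rebuild: the helper below pulls the bracketed return addresses for one binary out of the server log so they can be handed to addr2line. The function name extract_bt_addrs and the log file name server.log are made up for illustration, and this assumes binutils addr2line is installed and that the log has one frame per line (unlike the wrapped copy quoted below).

```shell
# Rebuild with debug info first, using either option mentioned above:
#   ./configure --enable-strict && make
#   CFLAGS=-g ./configure && make

# extract_bt_addrs (hypothetical helper): read server log lines on stdin
# and print the bracketed return address of every [bt] frame that belongs
# to the binary given as $1, one address per line.
extract_bt_addrs() {
    grep -F "[bt] $1" | grep -o '\[0x[0-9a-f]*]' | grep -o '0x[0-9a-f]*'
}

# Usage (server.log is an assumed file name):
#   extract_bt_addrs /usr/local/sbin/pvfs2-server < server.log |
#       xargs addr2line -f -e /usr/local/sbin/pvfs2-server
```

The addresses only map to useful file and line information if the binary being inspected is the same build that produced the backtrace.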

It looks like the segfault might not be directly related to the bmi  
ib error that you're getting, but more that we're not correctly  
handling the error returned.
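
For reference, the error_code in the log can be pulled apart with a few lines of shell. The arithmetic is exact; the interpretation is an assumption to check against pvfs2-types.h: if bits 30 and 29 are the PVFS error and non-errno flag bits, the remaining low value of 1 would be a cancel code, which would match the "job time out: cancelling flow operation" message. The function name decode_pvfs_error is made up for this sketch.

```shell
# decode_pvfs_error (hypothetical helper): split a negative error_code from
# the server log into its hex magnitude, bits 30 and 29, and the low bits.
# Treating bits 30/29 as PVFS flag bits is an assumption, not confirmed here.
decode_pvfs_error() {
    mag=$(( 0 - ($1) ))
    printf 'magnitude=0x%x bit30=%d bit29=%d low=%d\n' \
        "$mag" $(( (mag >> 30) & 1 )) $(( (mag >> 29) & 1 )) \
        $(( mag & 0x1fffffff ))
}

decode_pvfs_error -1610612737
# prints: magnitude=0x60000001 bit30=1 bit29=1 low=1
```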

-sam

On Apr 4, 2007, at 12:21 PM, Kyle Schochenmaier wrote:

>
> I can reproducibly trigger this error on the server by doing  
> multiple instances of pvfs2-cp over various IB hardware.
>
> For this one, I did:
>
> pvfs2-cp -t /pvfs2/1node/test2 /dev/null &
> pvfs2-cp -t /pvfs2/1node/test2 /dev/null &
> pvfs2-cp -t /pvfs2/1node/test2 /dev/null &
> pvfs2-cp -t /pvfs2/1node/test2 /dev/null &
> pvfs2-cp -t /pvfs2/1node/test2 /dev/null &
> pvfs2-cp -t /pvfs2/1node/test2 /dev/null
>
> That should be 6 of them. Everything worked fine up until 6
> processes started hammering the server.  I can reproduce this with
> only 3 processes using faster/lower-latency hardware on the client.
>
> Any ideas where to start tracking this one down?
>
> Kyle
>
>
>
> [D 21:25:03.375378] PVFS2 Server version 2.6.2pre1-2007-02-23-150254 starting.
> [E 17:00:09.488945] job_time_mgr_expire: job time out: cancelling flow operation, job_id: 4608742.
> [E 17:00:09.571458] fp_multiqueue_cancel: flow proto cancel called on 0x60a620
> [E 17:00:09.572098] handle_io_error: flow proto error cleanup started on 0x60a620, error_code: -1610612737
> [E 17:00:09.573040] handle_io_error: flow proto 0x60a620 canceled 8 operations, will clean up.
> [E 17:00:09.573595] handle_io_error: flow proto 0x60a620 error cleanup finished, error_code: -1610612737
> [E 17:00:09.573630] Error: memcache_memfree: buf 0x2aaaab272010 len 262144 count = 2, expected 1.
> [E 17:00:09.630067]     [bt] /usr/local/sbin/pvfs2-server(error+0xca) [0x42e40a]
> [E 17:00:09.630112]     [bt] /usr/local/sbin/pvfs2-server [0x42e72a]
> [E 17:00:09.630120]     [bt] /usr/local/sbin/pvfs2-server [0x431fa2]
> [E 17:00:09.630128]     [bt] /usr/local/sbin/pvfs2-server [0x433a69]
> [E 17:00:09.630136]     [bt] /usr/local/sbin/pvfs2-server [0x43df38]
> [E 17:00:09.630143]     [bt] /lib/libpthread.so.0 [0x2b3168be0b1c]
> [E 17:00:09.630151]     [bt] /lib/libc.so.6(__clone+0x72) [0x2b3168fcd9c2]
>
> _______________________________________________
> Pvfs2-developers mailing list
> Pvfs2-developers at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>


