[Pvfs2-developers] Re: PVFS2/InfiniBand hangs
Kyle Schochenmaier
kschoche at scl.ameslab.gov
Wed May 10 13:16:17 EDT 2006
Pete Wyckoff wrote:
>kschoche at scl.ameslab.gov wrote on Tue, 09 May 2006 16:45 -0500:
>
>
>>We've been consistently running into hangs during our single-client
>>tests of pvfs2 over InfiniBand. We've narrowed it down to being
>>reproducible by doing tab completion on the pvfs2 file system (using
>>the kernel/pvfs-client interface), or by doing 'ls' on the file system
>>(though the latter is less reproducible). It appears that the server or
>>client goes into a waiting loop somehow and stalls, while from the
>>server side it appears that the client may have unexpectedly
>>disconnected, or possibly that the client 'finished' its request.
>>Meanwhile, the client waits with 100% CPU usage, and the server sits at
>>0% CPU usage until someone pokes the server, i.e. another client makes
>>some request to it, at which point the client wakes up and everything
>>continues as if the stall had not occurred.
>>
>>This is the client-side log from running a pvfs-client over the
>>kernel module interface to the file system, with the debugging flags
>>set to PVFS2_DEBUGMASK=server,network,client. The beginning is from a
>>test of untarring a kernel tarball, which stalls before completion (I've
>>marked the spot where it hangs with a '*').
>>
>>After about 3 minutes [D 15:45:...], we issued a 'pvfs2-ls' from
>>another machine, and the box "un-hung" itself. While the client was
>>hung, the client-core process was using 100% CPU, and the servers
>>showed no pvfs2-related activity (0% CPU from pvfs-server).
>>
>>
>[..]
>
>
>>[D 15:42:52.493905] PVFS_isys_readdir entered
>>
>>
>
>And readdir completed fine.
>
>
>
>>[D 15:42:56.908679] PVFS_isys_getattr entered
>>[D 15:42:56.908723] (0x1013c950) getattr_setup_msgpair
>>[D 15:42:56.908855] BMI_post_recv: addr: 155, offset: 0x10429ea0, size: 9280
>>[D 15:42:56.908877] BMI_ib_post_recv: expected len 9280 tag 6.
>>[D 15:42:56.908899] generic_post_recv: new rq 0x101061f0.
>>[D 15:42:56.908928] BMI_post_sendunexpected_list: addr: 155, count: 1, total_size: 40
>>[D 15:42:56.908947] element 0: offset: 0x100d3600, size: 40
>>[D 15:42:56.908967] BMI_ib_post_sendunexpected_list: listlen 1 tag 6.
>>[D 15:42:56.908988] generic_post_send: new sq 0x100f9150.
>>[D 15:42:56.909007] encourage_send_waiting_buffer: sq 0x100f9150.
>>[D 15:42:56.909029] post_rr_ack: da6:3332 bh 5.
>>[D 15:42:56.909049] post_sr: da6:3332 bh 5 len 56 wr 10/1023.
>>[D 15:42:56.909070] encourage_send_waiting_buffer: sq 0x100f9150 sent EAGER now SQ_WAITING_EAGER_ACK.
>>[D 15:42:56.909088] Posted PVFS_SYS_GETATTR (waiting for test)
>>[D 15:42:56.909112] check_cq: da6:3332 periodic sr flush (qp or qp_ack).
>>[D 15:42:56.909591] check_cq: ack message da6:3332 my bufnum 5.
>>[D 15:42:56.909610] check_cq: sq 0x100f9150 SQ_WAITING_EAGER_ACK -> SQ_WAITING_USER_TEST.
>>[D 15:42:56.909631] test_sq: sq 0x100f9150 completed 40 to da6.
>>*[D 15:42:56.909650] BMI_testcontext completing: 207
>>
>>
>
>Getattr: client posts recv then sends the request. Send completed
>fine. Waiting on recv (spinning at 100%, nothing better to do).
>
>
>
>>*[D 15:45:16.050430] check_cq: found len 88 at da6:3332 my bufnum 5 type MSG_EAGER_SEND.
>>[D 15:45:16.050466] encourage_recv_incoming: recv eager my bufnum 5 his bufnum 5 len 88.
>>
>>
>
>Somehow you bumped the server to send a response. No idea why it
>waited until now to respond. A server-side trace would help.
>
>
>
I bumped the server by running pvfs2-ls on a third box ;)
>This would be much easier to debug if you could break things without
>the kernel interface. I'm desperately looking for debuggable
>failure modes, and will see if I can reproduce yours with or without
>the kernel.
> -- Pete
>
>
>
>
As for breakage without the kernel interface, we had extensive problems
with this when testing the NetPipe module that we built for pvfs2. The
catch is that the bug seems somewhat random when found this way: we
haven't yet seen it fail at the same place (same iteration, same size)
across multiple runs, but we do get a hang at least once per run. The
module was modeled on the semantics of pvfs-cp, so it does not require
the kernel interface, and I think it may be a good way to get moderately
reproducible failures without the kernel.
I will work on getting a server-side trace out today. I think I have a
log (server + client) from yesterday where it stalled on a tab-completion
attempt, though not from this same run.
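For lining up client and server traces, it may help to locate stalls mechanically instead of scanning for them by eye. Below is a minimal sketch in Python that reports any pause between consecutive timestamped lines in a debug log. It assumes only the "[D HH:MM:SS.uuuuuu]" prefix visible in the log above; the function name find_gaps and the threshold parameter are made up for illustration and are not part of pvfs2. It also assumes the log does not cross midnight.

```python
import re
from datetime import datetime

# Matches the "[D 15:42:56.909650]" style prefix seen in the client log.
# Searching (not matching at line start) tolerates ">>" or "*" prefixes.
STAMP = re.compile(r"\[[A-Z] (\d{2}:\d{2}:\d{2}\.\d{6})\]")

def find_gaps(lines, threshold_s=1.0):
    """Return (gap_seconds, line_before, line_after) for each pause
    longer than threshold_s between consecutive timestamped lines."""
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        m = STAMP.search(line)
        if not m:
            continue  # wrapped continuation lines carry no timestamp
        t = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        if prev_time is not None:
            delta = (t - prev_time).total_seconds()
            if delta > threshold_s:
                gaps.append((delta, prev_line.rstrip(), line.rstrip()))
        prev_time, prev_line = t, line
    return gaps

# Three lines taken from the client log quoted in this thread.
sample = [
    "[D 15:42:56.909631] test_sq: sq 0x100f9150 completed 40 to da6.",
    "*[D 15:42:56.909650] BMI_testcontext completing: 207",
    "*[D 15:45:16.050430] check_cq: found len 88 at da6:3332 my bufnum 5 type",
]
gaps = find_gaps(sample)
for seconds, before, after in gaps:
    print(f"{seconds:.3f}s stall between:\n  {before}\n  {after}")
```

Running the same function over the server log, with the clocks on both machines reasonably in sync, should show whether the server was idle over the same window.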
-Kyle
--
Kyle Schochenmaier
kschoche at scl.ameslab.gov
Research Assistant, Dr. Brett Bode
AmesLab - US Dept.Energy
Scalable Computing Laboratory