[Pvfs2-developers] PVFS2/Infiniband hangs

Kyle Schochenmaier kschoche at scl.ameslab.gov
Tue May 9 18:00:29 EDT 2006


We've been consistently running into hangs during our single-client 
tests of InfiniBand over pvfs2, we've gotten it down to being 
reproducable by trying/doing tabbed-completion of the pvfs2 
file-system(using the kernel/pvfs-client interface), or by doing 'ls' on 
the file-system (though the later is less reproducable)  It appears that 
the server or client goes into a waiting loop somehow and stalls, while 
from the server side, it appears that the client may have unexpectedly 
disconnected, or possibly the client 'finished' its request.  Meanwhile, 
the client waits with 100% cpu usage, and the server sits there @ 0% cpu 
usage until someone pokes the server, i.e. another client makes some 
request to the server, at which point the client wakes up and everything 
continues as if the stall had not occured.

This log is the client-side log from running a pvfs-client over the 
kernel-module/interface to the file-system, we had the debugging flags 
set to PVFS2_DEBUGMASK=server,network,client.  The beginning is from a 
test of untarring a kernel tarball, which stalls before completion (I've 
*'d the spot where it hangs)

After about 3 minutes, we [D 15:45:...] We issued a 'pvfs2-ls' from 
another machine, and the box "un-hung" itself.  During the period when 
the client was 'hung' the client had 100% cpu utilization occuring from 
the client-core process, and the servers had no pvfs2-related activity 
(0% cpu from pvfs-server).

Is there anything this log can tell us-hopefully what state we are in 
when it hangs, etc?
Thanks,

Kyle

LOG:
<snipped the beginning>
[D 15:42:52.491387] readdir state: init
[D 15:42:52.491407] (0x10122930) getattr state: getattr_cleanup
[D 15:42:52.491428] readdir state: readdir_msg_setup_msgpair
[D 15:42:52.492010] BMI_post_recv: addr: 155, offset: 0x400020e7010, 
size: 135720
[D 15:42:52.492033] BMI_ib_post_recv: expected len 135720 tag 4.
[D 15:42:52.492055] generic_post_recv: new rq 0x101061f0.
[D 15:42:52.492084] BMI_post_sendunexpected_list: addr: 155, count: 1, 
total_size: 44
[D 15:42:52.492103]    element 0: offset: 0x100d3600, size: 44
[D 15:42:52.492123] BMI_ib_post_sendunexpected_list: listlen 1 tag 4.
[D 15:42:52.492143] generic_post_send: new sq 0x100f9150.
[D 15:42:52.492161] encourage_send_waiting_buffer: sq 0x100f9150.
[D 15:42:52.492182] post_rr_ack: da6:3332 bh 3.
[D 15:42:52.492202] post_sr: da6:3332 bh 3 len 60 wr 6/1023.
[D 15:42:52.492222] encourage_send_waiting_buffer: sq 0x100f9150 sent 
EAGER now SQ_WAITING_EAGER_ACK.
[D 15:42:52.492240] Posted PVFS_SYS_READDIR (waiting for test)
[D 15:42:52.492265] check_cq: da6:3332 periodic sr flush (qp or qp_ack).
[D 15:42:52.492285] check_cq: ack message da6:3332 my bufnum 3.
[D 15:42:52.492304] check_cq: sq 0x100f9150 SQ_WAITING_EAGER_ACK -> 
SQ_WAITING_USER_TEST.
[D 15:42:52.492324] test_sq: sq 0x100f9150 completed 44 to da6.
[D 15:42:52.492344] BMI_testcontext completing: 189
[D 15:42:52.492494] check_cq: found len 248 at da6:3332 my bufnum 3 type 
MSG_EAGER_SEND.
[D 15:42:52.492515] encourage_recv_incoming: recv eager my bufnum 3 his 
bufnum 3 len 248.
[D 15:42:52.492534] post_rr: da6:3332 bh 3.
[D 15:42:52.492554] post_sr_ack: da6:3332 bh 3 wr 7/1023.
[D 15:42:52.492573] encourage_recv_incoming: matched rq 0x101061f0 now 
RQ_EAGER_WAITING_USER_TEST.
[D 15:42:52.492592] encourage_recv_incoming: early registration not 
needed, dereg after eager.
[D 15:42:52.492611] check_cq: da6:3332 periodic sr flush (qp or qp_ack).
[D 15:42:52.492631] test_rq: rq 0x101061f0 completed 232 from da6.
[D 15:42:52.492651] BMI_testcontext completing: 187
[D 15:42:52.492675] readdir_msg_comp_fn
[D 15:42:52.493840] readdir state: cleanup
[D 15:42:52.493905] PVFS_isys_readdir entered
[D 15:42:52.493929] readdir state: init
[D 15:42:52.493949] (0x1012f940) getattr state: getattr_cleanup
[D 15:42:52.493968] readdir state: readdir_msg_setup_msgpair
[D 15:42:52.494472] BMI_post_recv: addr: 155, offset: 0x400020e7010, 
size: 135720
[D 15:42:52.494496] BMI_ib_post_recv: expected len 135720 tag 5.
[D 15:42:52.494522] generic_post_recv: new rq 0x101061f0.
[D 15:42:52.494550] BMI_post_sendunexpected_list: addr: 155, count: 1, 
total_size: 44
[D 15:42:52.494569]    element 0: offset: 0x100d3600, size: 44
[D 15:42:52.494588] BMI_ib_post_sendunexpected_list: listlen 1 tag 5.
[D 15:42:52.494608] generic_post_send: new sq 0x100f9150.
[D 15:42:52.494626] encourage_send_waiting_buffer: sq 0x100f9150.
[D 15:42:52.494647] post_rr_ack: da6:3332 bh 4.
[D 15:42:52.494668] post_sr: da6:3332 bh 4 len 60 wr 8/1023.
[D 15:42:52.494688] encourage_send_waiting_buffer: sq 0x100f9150 sent 
EAGER now SQ_WAITING_EAGER_ACK.
[D 15:42:52.494707] Posted PVFS_SYS_READDIR (waiting for test)
[D 15:42:52.494730] check_cq: da6:3332 periodic sr flush (qp or qp_ack).
[D 15:42:52.494750] check_cq: ack message da6:3332 my bufnum 4.
[D 15:42:52.494770] check_cq: sq 0x100f9150 SQ_WAITING_EAGER_ACK -> 
SQ_WAITING_USER_TEST.
[D 15:42:52.494790] test_sq: sq 0x100f9150 completed 44 to da6.
[D 15:42:52.494810] BMI_testcontext completing: 196
[D 15:42:52.494870] check_cq: found len 56 at da6:3332 my bufnum 4 type 
MSG_EAGER_SEND.
[D 15:42:52.494890] encourage_recv_incoming: recv eager my bufnum 4 his 
bufnum 4 len 56.
[D 15:42:52.494909] post_rr: da6:3332 bh 4.
[D 15:42:52.494929] post_sr_ack: da6:3332 bh 4 wr 9/1023.
[D 15:42:52.494948] encourage_recv_incoming: matched rq 0x101061f0 now 
RQ_EAGER_WAITING_USER_TEST.
[D 15:42:52.494967] encourage_recv_incoming: early registration not 
needed, dereg after eager.
[D 15:42:52.494987] check_cq: da6:3332 periodic sr flush (qp or qp_ack).
[D 15:42:52.495007] test_rq: rq 0x101061f0 completed 40 from da6.
[D 15:42:52.495029] BMI_testcontext completing: 194
[D 15:42:52.495053] readdir_msg_comp_fn
[D 15:42:52.496207] readdir state: cleanup
[D 15:42:56.908679] PVFS_isys_getattr entered
[D 15:42:56.908723] (0x1013c950) getattr_setup_msgpair
[D 15:42:56.908855] BMI_post_recv: addr: 155, offset: 0x10429ea0, size: 
9280
[D 15:42:56.908877] BMI_ib_post_recv: expected len 9280 tag 6.
[D 15:42:56.908899] generic_post_recv: new rq 0x101061f0.
[D 15:42:56.908928] BMI_post_sendunexpected_list: addr: 155, count: 1, 
total_size: 40
[D 15:42:56.908947]    element 0: offset: 0x100d3600, size: 40
[D 15:42:56.908967] BMI_ib_post_sendunexpected_list: listlen 1 tag 6.
[D 15:42:56.908988] generic_post_send: new sq 0x100f9150.
[D 15:42:56.909007] encourage_send_waiting_buffer: sq 0x100f9150.
[D 15:42:56.909029] post_rr_ack: da6:3332 bh 5.
[D 15:42:56.909049] post_sr: da6:3332 bh 5 len 56 wr 10/1023.
[D 15:42:56.909070] encourage_send_waiting_buffer: sq 0x100f9150 sent 
EAGER now SQ_WAITING_EAGER_ACK.
[D 15:42:56.909088] Posted PVFS_SYS_GETATTR (waiting for test)
[D 15:42:56.909112] check_cq: da6:3332 periodic sr flush (qp or qp_ack).
[D 15:42:56.909591] check_cq: ack message da6:3332 my bufnum 5.
[D 15:42:56.909610] check_cq: sq 0x100f9150 SQ_WAITING_EAGER_ACK -> 
SQ_WAITING_USER_TEST.
[D 15:42:56.909631] test_sq: sq 0x100f9150 completed 40 to da6.
*[D 15:42:56.909650] BMI_testcontext completing: 207
*[D 15:45:16.050430] check_cq: found len 88 at da6:3332 my bufnum 5 type 
MSG_EAGER_SEND.
[D 15:45:16.050466] encourage_recv_incoming: recv eager my bufnum 5 his 
bufnum 5 len 88.
[D 15:45:16.050486] post_rr: da6:3332 bh 5.
[D 15:45:16.050508] post_sr_ack: da6:3332 bh 5 wr 11/1023.
[D 15:45:16.050529] encourage_recv_incoming: matched rq 0x101061f0 now 
RQ_EAGER_WAITING_USER_TEST.
[D 15:45:16.050548] encourage_recv_incoming: early registration not 
needed, dereg after eager.
[D 15:45:16.050568] check_cq: da6:3332 periodic sr flush (qp or qp_ack).
[D 15:45:16.050589] test_rq: rq 0x101061f0 completed 72 from da6.
[D 15:45:16.050609] BMI_testcontext completing: 205
[D 15:45:16.050797] trying to add object reference to acache
[D 15:45:16.050816] (0x1013c950) getattr state: getattr_cleanup

-- 
Kyle Schochenmaier
kschoche at scl.ameslab.gov
Research Assistant, Dr. Brett Bode
AmesLab - US Dept.Energy
Scalable Computing Laboratory 



More information about the Pvfs2-developers mailing list