[lwhatley.ctr@navo.hpc.mil: Re: [Pvfs2-developers] Re: [Pvfs2-users] PVFS2 on Infiniband]

Lee Whatley, Contractor lwhatley.ctr at navo.hpc.mil
Mon Jun 5 11:53:41 EDT 2006


Pete Wyckoff wrote:
> If you want to try a couple of patches, just for kicks, we can maybe
> hack around this alleged starvation problem.  First, we can toss in
> a sched_yield() and see if it magically gets the kernel to make the
> dbpf thread run instead.  Near the bottom of the function
> BMI_ib_testcontext() in src/io/bmi/bmi_ib/ib.c, there's this code:

Pete, I think you're on to something!  Adding the sched_yield() call to 
the BMI_ib_testcontext() function has sped things up tremendously.  In 
addition, pvfs2-server now only shows 10% CPU usage (and not for nearly 
as long as previously) in top when doing the pvfs2-mkdir command.

I've attached the pvfs2-server.log for a pvfs2-mkdir after adding the 
sync_yield call.  The fact that a mkdir still takes 10% of the CPU still 
bothers me, so I'm wondering if there is still some small thread 
starvation problem somewhere else.  What do you think?

Thanks,
-Lee

-------------- next part --------------
[D 10:46:07.270928] PVFS2 Server version 1.4.1pre1-2006-05-22-144949 starting.
[D 10:46:07.270928] Passing ib://scn1-ib0:3334 as BMI listen address.
[D 10:46:07.270928] BMI_ib_initialize: init.
[D 10:46:07.280928] BMI_ib_initialize: built pd 5a4e90.
[D 10:46:07.280928] BMI_ib_initialize: max 16256 completion queue entries.
[D 10:46:07.280928] BMI_ib_initialize: asking for 1024 completion queue entries.
[D 10:46:07.280928] BMI_ib_initialize: done.
[D 10:46:07.280928] dbpf_thread_function started
[D 10:46:07.280928] dbpf_thread_initialize: initialized
[D 10:46:07.280928] collection lookup: version is 0.1.2
[D 10:46:07.280928] - set handle re-use timeout to 360 seconds (ret=0)
[D 10:46:07.440931] File system pvfs2-fs1 using handles:
	4-2147483650,2147483651-4294967297
[D 10:46:07.440931] Sync on metadata update for pvfs2-fs1: no
[D 10:46:07.440931] Sync on I/O data update for pvfs2-fs1: no
[D 10:46:07.440931] Storage Init Complete (aio-threaded)
[D 10:46:07.440931] 1 filesystem(s) initialized
[D 10:46:07.440931] Initialization completed successfully.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 0.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 1.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 2.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 3.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 4.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 5.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 6.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 7.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 8.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 9.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 10.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 11.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 12.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 13.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 14.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 15.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 16.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 17.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 18.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 19.
[D 10:46:11.401021] ib_tcp_server_check_new_connections: accepted new connection 11.0.0.1:49068 at server.
[D 10:46:11.401021] check_cq: found something.
[D 10:46:11.401021] check_cq: found len 40 at 11.0.0.1:49068 my bufnum 0 type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:11.401021] encourage_recv_incoming: incoming msg type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:11.401021] encourage_recv_incoming: recv eager unexpected my bufnum 0 his bufnum 0 len 40.
[D 10:46:11.401021] encourage_recv_incoming: new rq 0x892de0 now RQ_EAGER_WAITING_USER_TESTUNEXPECTED.
[D 10:46:11.401021] BMI_ib_testunexpected: found waiting testunexpected.
[D 10:46:11.401021] post_rr: 11.0.0.1:49068 bh 0.
[D 10:46:11.401021] post_sr_ack: 11.0.0.1:49068 his buf 0 wr 0/8191.
[D 10:46:11.401021] (0x818330) getconfig (prelude sm) state: req_sched
[D 10:46:11.401021] (0x818330) getconfig (prelude sm) state: getattr_if_needed
[D 10:46:11.401021] (0x818330) getconfig (prelude sm) state: perm_check (status = 0)
[D 10:46:11.401021] (0x818330) getconfig state: init
[D 10:46:11.401021] (0x818330) getconfig (FR sm) state: release: (error_code = 0)
[D 10:46:11.401021] (0x818330) getconfig (FR sm) state: send_resp (status = 0)
[D 10:46:11.411021] ib_mem_register: buf 0x892ea0 len 32808.
[D 10:46:11.411021] BMI_post_send_list: addr: 59, count: 1, total_size: 1040
[D 10:46:11.411021]    element 0: offset: 0x892ea0, size: 1040
[D 10:46:11.411021] BMI_ib_post_send_list: listlen 1 tag 1.
[D 10:46:11.411021] generic_post_send: new sq 0x89aed0.
[D 10:46:11.411021] encourage_send_waiting_buffer: sq 0x89aed0.
[D 10:46:11.411021] post_rr_ack: 11.0.0.1:49068 bh 0.
[D 10:46:11.411021] post_sr: 11.0.0.1:49068 bh 0 len 1056 wr 0/8191.
[D 10:46:11.411021] encourage_send_waiting_buffer: sq 0x89aed0 sent EAGER now SQ_WAITING_EAGER_ACK.
[D 10:46:11.411021] check_cq: found something.
[D 10:46:11.411021] check_cq: ack message 11.0.0.1:49068 my bufnum 0.
[D 10:46:11.411021] check_cq: sq 0x89aed0 SQ_WAITING_EAGER_ACK -> SQ_WAITING_USER_TEST.
[D 10:46:11.411021] test_sq: sq 0x89aed0 completed 1040 to 11.0.0.1:49068.
[D 10:46:11.411021] BMI_testcontext completing: 63
[D 10:46:11.411021] (0x818330) getconfig (FR sm) state: cleanup
[D 10:46:11.411021] check_cq: found something.
[D 10:46:11.411021] check_cq: found len 56 at 11.0.0.1:49068 my bufnum 1 type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:11.411021] encourage_recv_incoming: incoming msg type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:11.411021] encourage_recv_incoming: recv eager unexpected my bufnum 1 his bufnum 1 len 56.
[D 10:46:11.411021] encourage_recv_incoming: new rq 0x819390 now RQ_EAGER_WAITING_USER_TESTUNEXPECTED.
[D 10:46:11.411021] BMI_ib_testunexpected: found waiting testunexpected.
[D 10:46:11.411021] post_rr: 11.0.0.1:49068 bh 1.
[D 10:46:11.411021] ib_mem_deregister: buf 0x892ea0 len 32808 lkey a13ba800 rkey a13ba800.
[D 10:46:11.411021] (0x818330) getconfig state: cleanup
[D 10:46:11.411021] post_sr_ack: 11.0.0.1:49068 his buf 1 wr 1/8191.
[D 10:46:11.411021] (0x819970) mgmt_setparam (prelude sm) state: req_sched
[D 10:46:11.411021] (0x819970) mgmt_setparam (prelude sm) state: getattr_if_needed
[D 10:46:11.411021] (0x819970) mgmt_setparam (prelude sm) state: perm_check (status = 0)
[D 10:46:11.411021] (0x819970) mgmt_setparam state: work
[D 10:46:11.411021] (0x819970) mgmt_setparam (FR sm) state: release: (error_code = 0)
[D 10:46:11.411021] (0x819970) mgmt_setparam (FR sm) state: send_resp (status = 0)
[D 10:46:11.411021] BMI_post_send_list: addr: 59, count: 1, total_size: 24
[D 10:46:11.411021]    element 0: offset: 0x892e80, size: 24
[D 10:46:11.411021] BMI_ib_post_send_list: listlen 1 tag 2.
[D 10:46:11.411021] generic_post_send: new sq 0x818330.
[D 10:46:11.411021] encourage_send_waiting_buffer: sq 0x818330.
[D 10:46:11.411021] post_rr_ack: 11.0.0.1:49068 bh 1.
[D 10:46:11.411021] post_sr: 11.0.0.1:49068 bh 1 len 40 wr 1/8191.
[D 10:46:11.411021] encourage_send_waiting_buffer: sq 0x818330 sent EAGER now SQ_WAITING_EAGER_ACK.
[D 10:46:11.411021] check_cq: found something.
[D 10:46:11.411021] check_cq: ack message 11.0.0.1:49068 my bufnum 1.
[D 10:46:11.411021] check_cq: sq 0x818330 SQ_WAITING_EAGER_ACK -> SQ_WAITING_USER_TEST.
[D 10:46:11.411021] test_sq: sq 0x818330 completed 24 to 11.0.0.1:49068.
[D 10:46:11.411021] BMI_testcontext completing: 68
[D 10:46:11.411021] (0x819970) mgmt_setparam (FR sm) state: cleanup
[D 10:46:11.411021] check_cq: found something.
[D 10:46:11.411021] check_cq: found len 8 at 11.0.0.1:49068 my bufnum 2 type MSG_BYE.
[D 10:46:11.411021] encourage_recv_incoming: incoming msg type MSG_BYE.
[D 10:46:11.411021] encourage_recv_incoming: recv BYE my bufnum 2 len 8.
[D 10:46:11.411021] ib_close_connection: closing connection to 11.0.0.1:49068.
[A 10:46:11.411021] root.root at 11.0.0.1:49068 H=0 S=0x819970: mgmt_setparam: finish (Success)
[D 10:46:11.411021] (0x819970) mgmt_setparam state: cleanup
[D 10:46:11.411021]  - setparam returning 0
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 0.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 1.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 2.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 3.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 4.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 5.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 6.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 7.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 8.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 9.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 10.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 11.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 12.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 13.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 14.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 15.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 16.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 17.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 18.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 19.
[D 10:46:16.491136] ib_tcp_server_check_new_connections: accepted new connection 11.0.0.1:49069 at server.
[D 10:46:16.491136] check_cq: found something.
[D 10:46:16.491136] check_cq: found len 40 at 11.0.0.1:49069 my bufnum 0 type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:16.491136] encourage_recv_incoming: incoming msg type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:16.491136] encourage_recv_incoming: recv eager unexpected my bufnum 0 his bufnum 0 len 40.
[D 10:46:16.491136] encourage_recv_incoming: new rq 0x872860 now RQ_EAGER_WAITING_USER_TESTUNEXPECTED.
[D 10:46:16.491136] BMI_ib_testunexpected: found waiting testunexpected.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 0.
[D 10:46:16.491136] post_sr_ack: 11.0.0.1:49069 his buf 0 wr 0/8191.
[D 10:46:16.491136] (0x81ab10) getconfig (prelude sm) state: req_sched
[A 10:46:16.491136] root.root at 11.0.0.1:49069 H=0 S=0x81ab10: getconfig: request
[A 10:46:16.491136] root.root at 11.0.0.1:49069 H=0 S=0x81ab10: getconfig: start
[D 10:46:16.491136] (0x81ab10) getconfig (prelude sm) state: getattr_if_needed
[D 10:46:16.491136] PVFS operation "getconfig" got attr mask 127
	(attr_uid_valid? yes, attr_owner = 0, credentials_uid = 0)
	(attr_gid_valid? yes, attr_group = 0, credentials.gid = 0)
[D 10:46:16.491136] Final permission check for "getconfig" set error code to 0
[D 10:46:16.491136] (0x81ab10) getconfig (prelude sm) state: perm_check (status = 0)
[D 10:46:16.491136] (0x81ab10) getconfig state: init
[D 10:46:16.491136] (0x81ab10) getconfig (FR sm) state: release: (error_code = 0)
[D 10:46:16.491136] (0x81ab10) getconfig (FR sm) state: send_resp (status = 0)
[D 10:46:16.491136] ib_mem_register: buf 0x89c180 len 32808.
[D 10:46:16.491136] BMI_post_send_list: addr: 80, count: 1, total_size: 1040
[D 10:46:16.491136]    element 0: offset: 0x89c180, size: 1040
[D 10:46:16.491136] BMI_ib_post_send_list: listlen 1 tag 1.
[D 10:46:16.491136] generic_post_send: new sq 0x872860.
[D 10:46:16.491136] encourage_send_waiting_buffer: sq 0x872860.
[D 10:46:16.491136] post_rr_ack: 11.0.0.1:49069 bh 0.
[D 10:46:16.491136] post_sr: 11.0.0.1:49069 bh 0 len 1056 wr 0/8191.
[D 10:46:16.491136] encourage_send_waiting_buffer: sq 0x872860 sent EAGER now SQ_WAITING_EAGER_ACK.
[D 10:46:16.491136] check_cq: found something.
[D 10:46:16.491136] check_cq: ack message 11.0.0.1:49069 my bufnum 0.
[D 10:46:16.491136] check_cq: sq 0x872860 SQ_WAITING_EAGER_ACK -> SQ_WAITING_USER_TEST.
[D 10:46:16.491136] test_sq: sq 0x872860 completed 1040 to 11.0.0.1:49069.
[D 10:46:16.491136] BMI_testcontext completing: 84
[D 10:46:16.491136] (0x81ab10) getconfig (FR sm) state: cleanup
[A 10:46:16.491136] root.root at 11.0.0.1:49069 H=0 S=0x81ab10: getconfig: finish (Success)
[D 10:46:16.491136] ib_mem_deregister: buf 0x89c180 len 32808 lkey 213b9100 rkey 213b9100.
[D 10:46:16.491136] (0x81ab10) getconfig state: cleanup
[D 10:46:16.491136] check_cq: found something.
[D 10:46:16.491136] check_cq: found len 56 at 11.0.0.1:49069 my bufnum 1 type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:16.491136] encourage_recv_incoming: incoming msg type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:16.491136] encourage_recv_incoming: recv eager unexpected my bufnum 1 his bufnum 1 len 56.
[D 10:46:16.491136] encourage_recv_incoming: new rq 0x872860 now RQ_EAGER_WAITING_USER_TESTUNEXPECTED.
[D 10:46:16.491136] BMI_ib_testunexpected: found waiting testunexpected.
[D 10:46:16.491136] post_rr: 11.0.0.1:49069 bh 1.
[D 10:46:16.491136] post_sr_ack: 11.0.0.1:49069 his buf 1 wr 1/8191.
[D 10:46:16.491136] (0x81bcb0) getattr (prelude sm) state: req_sched
[A 10:46:16.491136] root.root at 11.0.0.1:49069 H=1048576 S=0x81bcb0: getattr: request
[D 10:46:16.491136] REQ SCHED POSTING, handle: 1048576, queue_element: 0x819e20
[D 10:46:16.491136] REQ SCHED SCHEDULING, handle: 1048576, queue_element: 0x819e20
[A 10:46:16.491136] root.root at 11.0.0.1:49069 H=1048576 S=0x81bcb0: getattr: start
[D 10:46:16.491136] (0x81bcb0) getattr (prelude sm) state: getattr_if_needed
[D 10:46:16.491136] About to retrieve attributes for handle 1048576
[D 10:46:16.491136] dbpf_open_cache_get: called
[D 10:46:16.491136] dbpf_open_cache_get: create_flag: 0
[D 10:46:16.491136] dbpf_open_cache_get: resetting entry from free list.
[D 10:46:16.491136] dbpf_open_cache open_fd: opening fd 1048576, create: 0
[D 10:46:16.491136] dbpf_open_cache open_fd: filename: //pvfs-1/6292076b/bstreams/00000000/00100000.bstream
[D 10:46:16.491136] dbpf_open_cache_get: could not open (ret=-1073742082)
[D 10:46:16.501136] ATTRIB: retrieved attributes from DISK for key 1048576
	uid = 0, mode = 511, type = 4, dfile_count = 0, dist_size = 0
	b_size = 0, k_size = 2980
[D 10:46:16.501136] dbpf_attr_cache_insert: inserting 1048576 (k_size is 2980 | b_size is 0)
[D 10:46:16.501136] PVFS operation "getattr" got attr mask 127
	(attr_uid_valid? yes, attr_owner = 0, credentials_uid = 0)
	(attr_gid_valid? yes, attr_group = 0, credentials.gid = 0)
[D 10:46:16.501136] Final permission check for "getattr" set error code to 0
[D 10:46:16.501136] (0x81bcb0) getattr (prelude sm) state: perm_check (status = 0)
[D 10:46:16.501136] (0x81bcb0) getattr state: verify_attribs
[D 10:46:16.501136]  VERSION is 4937160830552309056, mtime is 1149522334
[A 10:46:16.501136] root.root at 11.0.0.1:49069 H=1048576 S=0x81bcb0: getattr: type: directory
[D 10:46:16.501136]  getattr: dirent_count not needed.
[D 10:46:16.501136] (0x81bcb0) getattr state: setup_resp
[D 10:46:16.501136] - RETURNING retrieved attrs: [owner = 0, group = 0
	perms = 777, type = 4, atime = 1149522322, mtime = 1149522334
	ctime = 1149522334, dfile_count = 0, dist_size = 0]
[D 10:46:16.501136] @ End DIRECTORY attributes: sending status 0 (error = 0)
[D 10:46:16.501136] (0x81bcb0) getattr (FR sm) state: release: (error_code = 0)
[D 10:46:16.501136] REQ SCHED RELEASING, handle: 1048576, queue_element: 0x819e20
[D 10:46:16.501136] (0x81bcb0) getattr (FR sm) state: send_resp (status = 0)
[D 10:46:16.501136] ib_mem_register: buf 0x8953c0 len 9280.
[D 10:46:16.501136] BMI_post_send_list: addr: 80, count: 1, total_size: 64
[D 10:46:16.501136]    element 0: offset: 0x8953c0, size: 64
[D 10:46:16.501136] BMI_ib_post_send_list: listlen 1 tag 2.
[D 10:46:16.501136] generic_post_send: new sq 0x872890.
[D 10:46:16.501136] encourage_send_waiting_buffer: sq 0x872890.
[D 10:46:16.501136] post_rr_ack: 11.0.0.1:49069 bh 1.
[D 10:46:16.501136] post_sr: 11.0.0.1:49069 bh 1 len 80 wr 1/8191.
[D 10:46:16.501136] encourage_send_waiting_buffer: sq 0x872890 sent EAGER now SQ_WAITING_EAGER_ACK.
[D 10:46:16.501136] check_cq: found something.
[D 10:46:16.501136] check_cq: ack message 11.0.0.1:49069 my bufnum 1.
[D 10:46:16.501136] check_cq: sq 0x872890 SQ_WAITING_EAGER_ACK -> SQ_WAITING_USER_TEST.
[D 10:46:16.501136] test_sq: sq 0x872890 completed 64 to 11.0.0.1:49069.
[D 10:46:16.501136] BMI_testcontext completing: 91
[D 10:46:16.501136] (0x81bcb0) getattr (FR sm) state: cleanup
[A 10:46:16.501136] root.root at 11.0.0.1:49069 H=1048576 S=0x81bcb0: getattr: finish (Success)
[D 10:46:16.501136] ib_mem_deregister: buf 0x8953c0 len 9280 lkey 493b8c00 rkey 493b8c00.
[D 10:46:16.501136] check_cq: found something.
[D 10:46:16.501136] (0x81bcb0) getattr state: getattr_cleanup
[D 10:46:16.501136] check_cq: found len 120 at 11.0.0.1:49069 my bufnum 2 type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:16.501136] encourage_recv_incoming: incoming msg type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:16.501136] encourage_recv_incoming: recv eager unexpected my bufnum 2 his bufnum 2 len 120.
[D 10:46:16.501136] encourage_recv_incoming: new rq 0x819a90 now RQ_EAGER_WAITING_USER_TESTUNEXPECTED.
[D 10:46:16.501136] BMI_ib_testunexpected: found waiting testunexpected.
[D 10:46:16.501136] post_rr: 11.0.0.1:49069 bh 2.
[D 10:46:16.501136] post_sr_ack: 11.0.0.1:49069 his buf 2 wr 2/8191.
[D 10:46:16.501136] (0x81ce50) mkdir (prelude sm) state: req_sched
[A 10:46:16.501136] root.root at 11.0.0.1:49069 H=0 S=0x81ce50: mkdir: request
[A 10:46:16.501136] root.root at 11.0.0.1:49069 H=0 S=0x81ce50: mkdir: start
[D 10:46:16.501136] (0x81ce50) mkdir (prelude sm) state: getattr_if_needed
[D 10:46:16.501136] PVFS operation "mkdir" got attr mask 127
	(attr_uid_valid? yes, attr_owner = 0, credentials_uid = 0)
	(attr_gid_valid? yes, attr_group = 0, credentials.gid = 0)
[D 10:46:16.501136] Final permission check for "mkdir" set error code to 0
[D 10:46:16.501136] (0x81ce50) mkdir (prelude sm) state: perm_check (status = 0)
[D 10:46:16.501136] (0x81ce50) mkdir state: prep_sm
[D 10:46:16.501136]  attrs to write: [owner = 0, group = 0, perms = 755]
[D 10:46:16.501136] (0x81ce50) mkdir state: create
[D 10:46:16.501136]  creating dspace on coll_id 1653737323
[D 10:46:16.501136]  directory handle will be in starting hrange[0] 4-2147483650
[D 10:46:16.501136] [1 extents] -- new_handle is 1054212 (cur_extent is 4 - 2147483650)
[D 10:46:16.501136] dbpf_attr_cache_insert: inserting 1054212 (k_size is 0 | b_size is 0)
[D 10:46:16.501136] (0x81ce50) mkdir state: setattrib
[D 10:46:16.501136]  *** new directory has handle 1054212
[D 10:46:16.501136]  setting directory version to 1149522376
	mtime is 0
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1054212 returned (num_elems=2)
[D 10:46:16.501136] Updating cached attributes for key 1054212
[D 10:46:16.501136] (0x81ce50) mkdir state: create_dirdata_dspace
[D 10:46:16.501136]  creating dspace on coll_id 1653737323
[D 10:46:16.501136]  dirdata handle will be in starting hrange[0] 4-2147483650
[D 10:46:16.501136] [1 extents] -- new_handle is 1054208 (cur_extent is 4 - 2147483650)
[D 10:46:16.501136] dbpf_attr_cache_insert: inserting 1054208 (k_size is 0 | b_size is 0)
[D 10:46:16.501136] (0x81ce50) mkdir state: write_dirdata_handle
[D 10:46:16.501136]  *** new dirdata has handle 1054208
[D 10:46:16.501136]   writing dirdata handle (1054208) into parent dir (1054212)
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1054212 returned (num_elems=3)
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1054212 returned (num_elems=3)
[D 10:46:16.501136] Setting data 0x81dab8 based on key 1054212 and key_str dir_ent (data_sz=8)
[D 10:46:16.501136] *** cached keyval data written (key is dir_ent)
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1054212 returned (num_elems=3)
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1054212 returned (num_elems=3)
[D 10:46:16.501136] ** CANNOT cache data written (key is dirdata_size)
[D 10:46:16.501136] (0x81ce50) mkdir (FR sm) state: release: (error_code = 0)
[D 10:46:16.501136] (0x81ce50) mkdir (FR sm) state: send_resp (status = 0)
[D 10:46:16.501136] BMI_post_send_list: addr: 80, count: 1, total_size: 24
[D 10:46:16.501136]    element 0: offset: 0x813e30, size: 24
[D 10:46:16.501136] BMI_ib_post_send_list: listlen 1 tag 3.
[D 10:46:16.501136] generic_post_send: new sq 0x819f50.
[D 10:46:16.501136] encourage_send_waiting_buffer: sq 0x819f50.
[D 10:46:16.501136] post_rr_ack: 11.0.0.1:49069 bh 2.
[D 10:46:16.501136] post_sr: 11.0.0.1:49069 bh 2 len 40 wr 2/8191.
[D 10:46:16.501136] encourage_send_waiting_buffer: sq 0x819f50 sent EAGER now SQ_WAITING_EAGER_ACK.
[D 10:46:16.501136] check_cq: found something.
[D 10:46:16.501136] check_cq: ack message 11.0.0.1:49069 my bufnum 2.
[D 10:46:16.501136] check_cq: sq 0x819f50 SQ_WAITING_EAGER_ACK -> SQ_WAITING_USER_TEST.
[D 10:46:16.501136] test_sq: sq 0x819f50 completed 24 to 11.0.0.1:49069.
[D 10:46:16.501136] BMI_testcontext completing: 101
[D 10:46:16.501136] (0x81ce50) mkdir (FR sm) state: cleanup
[A 10:46:16.501136] root.root at 11.0.0.1:49069 H=0 S=0x81ce50: mkdir: finish (Success)
[D 10:46:16.501136] (0x81ce50) mkdir state: cleanup
[D 10:46:16.501136] check_cq: found something.
[D 10:46:16.501136] check_cq: found len 104 at 11.0.0.1:49069 my bufnum 3 type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:16.501136] encourage_recv_incoming: incoming msg type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:16.501136] encourage_recv_incoming: recv eager unexpected my bufnum 3 his bufnum 3 len 104.
[D 10:46:16.501136] encourage_recv_incoming: new rq 0x872860 now RQ_EAGER_WAITING_USER_TESTUNEXPECTED.
[D 10:46:16.501136] BMI_ib_testunexpected: found waiting testunexpected.
[D 10:46:16.501136] post_rr: 11.0.0.1:49069 bh 3.
[D 10:46:16.501136] post_sr_ack: 11.0.0.1:49069 his buf 3 wr 3/8191.
[D 10:46:16.501136] (0x81dff0) crdirent (prelude sm) state: req_sched
[A 10:46:16.501136] root.root at 11.0.0.1:49069 H=1048576 S=0x81dff0: crdirent: request
[D 10:46:16.501136] REQ SCHED POSTING, handle: 1048576, queue_element: 0x819af0
[D 10:46:16.501136] REQ SCHED SCHEDULING, handle: 1048576, queue_element: 0x819af0
[A 10:46:16.501136] root.root at 11.0.0.1:49069 H=1048576 S=0x81dff0: crdirent: start
[D 10:46:16.501136] (0x81dff0) crdirent (prelude sm) state: getattr_if_needed
[D 10:46:16.501136] About to retrieve attributes for handle 1048576
[D 10:46:16.501136] dspace_getattr fast path attr cache hit on 1048576
 (dfile_count=0 | dist_size=0 | data_size=0)
[D 10:46:16.501136] PVFS operation "crdirent" got attr mask 127
	(attr_uid_valid? yes, attr_owner = 0, credentials_uid = 0)
	(attr_gid_valid? yes, attr_group = 0, credentials.gid = 0)
[D 10:46:16.501136]  - check_mode called --- (uid=0,gid=0,access_type=2)
[D 10:46:16.501136]  - object attributes --- (uid=0,gid=0,mode=511)
[D 10:46:16.501136]  - checking if uid (0) is root ...
[D 10:46:16.501136]  - yes
[D 10:46:16.501136]  - check_mode called --- (uid=0,gid=0,access_type=1)
[D 10:46:16.501136]  - object attributes --- (uid=0,gid=0,mode=511)
[D 10:46:16.501136]  - checking if uid (0) is root ...
[D 10:46:16.501136]  - yes
[D 10:46:16.501136] Final permission check for "crdirent" set error code to 0
[D 10:46:16.501136] (0x81dff0) crdirent (prelude sm) state: perm_check (status = 0)
[D 10:46:16.501136] (0x81dff0) crdirent state: setup_op
[A 10:46:16.501136] root.root at 11.0.0.1:49069 H=1048576 S=0x81dff0: crdirent: crdirent entry: test points to 1054212
[D 10:46:16.501136] (0x81dff0) crdirent state: validate
[D 10:46:16.501136]   got crdirent for test (with handle 1054212) in 1048576
[D 10:46:16.501136] (0x81dff0) crdirent state: read_directory_entry_handle
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1048576 returned (num_elems=3)
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1048576 returned (num_elems=3)
[D 10:46:16.501136] Setting data 0x81ec60 based on key 1048576 and key_str dir_ent (data_sz=8)
[D 10:46:16.501136] *** cached keyval data retrieved (key is dir_ent)
[D 10:46:16.501136] (0x81dff0) crdirent state: check_exist_directory_entry
[D 10:46:16.501136]   checking existence of new directory entry for test (handle = 1054212) to dirdata dspace 4
[D 10:46:16.501136] (0x81dff0) crdirent state: write_directory_entry
[D 10:46:16.501136]   writing new directory entry for test (handle = 1054212) to dirdata dspace 4
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1048576 returned (num_elems=3)
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1048576 returned (num_elems=3)
[D 10:46:16.501136] ** CANNOT cache data retrieved (key is dirdata_size)
[D 10:46:16.501136] incrementing dirent_count: fsid: 1653737323, handle: 1048576, count: 8
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1048576 returned (num_elems=3)
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1048576 returned (num_elems=3)
[D 10:46:16.501136] ** CANNOT cache data written (key is dirdata_size)
[D 10:46:16.501136] (0x81dff0) crdirent state: update_directory_entry
[D 10:46:16.501136] updating parent dir version to 4937161010940716432
	mtime is 1149522376
[D 10:46:16.501136] dbpf_cache_elem_lookup: cache elem matching 1048576 returned (num_elems=3)
[D 10:46:16.501136] Updating cached attributes for key 1048576
[D 10:46:16.501136] (0x81dff0) crdirent (FR sm) state: release: (error_code = 0)
[D 10:46:16.501136] REQ SCHED RELEASING, handle: 1048576, queue_element: 0x819af0
[D 10:46:16.501136] (0x81dff0) crdirent (FR sm) state: send_resp (status = 0)
[D 10:46:16.501136] BMI_post_send_list: addr: 80, count: 1, total_size: 16
[D 10:46:16.501136]    element 0: offset: 0x872770, size: 16
[D 10:46:16.501136] BMI_ib_post_send_list: listlen 1 tag 4.
[D 10:46:16.501136] generic_post_send: new sq 0x872860.
[D 10:46:16.501136] encourage_send_waiting_buffer: sq 0x872860.
[D 10:46:16.501136] post_rr_ack: 11.0.0.1:49069 bh 3.
[D 10:46:16.501136] post_sr: 11.0.0.1:49069 bh 3 len 32 wr 3/8191.
[D 10:46:16.501136] encourage_send_waiting_buffer: sq 0x872860 sent EAGER now SQ_WAITING_EAGER_ACK.
[D 10:46:16.501136] check_cq: found something.
[D 10:46:16.501136] check_cq: ack message 11.0.0.1:49069 my bufnum 3.
[D 10:46:16.501136] check_cq: sq 0x872860 SQ_WAITING_EAGER_ACK -> SQ_WAITING_USER_TEST.
[D 10:46:16.501136] check_cq: found something.
[D 10:46:16.501136] check_cq: found len 8 at 11.0.0.1:49069 my bufnum 4 type MSG_BYE.
[D 10:46:16.501136] encourage_recv_incoming: incoming msg type MSG_BYE.
[D 10:46:16.501136] encourage_recv_incoming: recv BYE my bufnum 4 len 8.
[D 10:46:16.501136] ib_close_connection: closing connection to 11.0.0.1:49069.
[D 10:46:16.501136] ib_close_connection: refcnt non-zero 1, delaying free.
[D 10:46:16.501136] test_sq: sq 0x872860 completed 16 to 11.0.0.1:49069.
[D 10:46:16.501136] ib_close_connection: closing connection to 11.0.0.1:49069.
[D 10:46:16.501136] BMI_testcontext completing: 113
[D 10:46:16.501136] (0x81dff0) crdirent (FR sm) state: cleanup
[A 10:46:16.501136] root.root@(unconnected) H=1048576 S=0x81dff0: crdirent: finish (Success)
[D 10:46:16.501136] (0x81dff0) crdirent state: cleanup
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 0.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 1.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 2.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 3.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 4.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 5.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 6.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 7.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 8.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 9.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 10.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 11.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 12.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 13.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 14.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 15.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 16.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 17.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 18.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 19.
[D 10:46:23.141286] ib_tcp_server_check_new_connections: accepted new connection 11.0.0.1:49077 at server.
[D 10:46:23.141286] check_cq: found something.
[D 10:46:23.141286] check_cq: found len 40 at 11.0.0.1:49077 my bufnum 0 type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:23.141286] encourage_recv_incoming: incoming msg type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:23.141286] encourage_recv_incoming: recv eager unexpected my bufnum 0 his bufnum 0 len 40.
[D 10:46:23.141286] encourage_recv_incoming: new rq 0x818b00 now RQ_EAGER_WAITING_USER_TESTUNEXPECTED.
[D 10:46:23.141286] BMI_ib_testunexpected: found waiting testunexpected.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 0.
[D 10:46:23.141286] post_sr_ack: 11.0.0.1:49077 his buf 0 wr 0/8191.
[D 10:46:23.141286] (0x81f190) getconfig (prelude sm) state: req_sched
[A 10:46:23.141286] root.root at 11.0.0.1:49077 H=0 S=0x81f190: getconfig: request
[A 10:46:23.141286] root.root at 11.0.0.1:49077 H=0 S=0x81f190: getconfig: start
[D 10:46:23.141286] (0x81f190) getconfig (prelude sm) state: getattr_if_needed
[D 10:46:23.141286] PVFS operation "getconfig" got attr mask 127
	(attr_uid_valid? yes, attr_owner = 0, credentials_uid = 0)
	(attr_gid_valid? yes, attr_group = 0, credentials.gid = 0)
[D 10:46:23.141286] Final permission check for "getconfig" set error code to 0
[D 10:46:23.141286] (0x81f190) getconfig (prelude sm) state: perm_check (status = 0)
[D 10:46:23.141286] (0x81f190) getconfig state: init
[D 10:46:23.141286] (0x81f190) getconfig (FR sm) state: release: (error_code = 0)
[D 10:46:23.141286] (0x81f190) getconfig (FR sm) state: send_resp (status = 0)
[D 10:46:23.141286] ib_mem_register: buf 0x89c180 len 32808.
[D 10:46:23.141286] BMI_post_send_list: addr: 126, count: 1, total_size: 1040
[D 10:46:23.141286]    element 0: offset: 0x89c180, size: 1040
[D 10:46:23.141286] BMI_ib_post_send_list: listlen 1 tag 1.
[D 10:46:23.141286] generic_post_send: new sq 0x8201f0.
[D 10:46:23.141286] encourage_send_waiting_buffer: sq 0x8201f0.
[D 10:46:23.141286] post_rr_ack: 11.0.0.1:49077 bh 0.
[D 10:46:23.141286] post_sr: 11.0.0.1:49077 bh 0 len 1056 wr 0/8191.
[D 10:46:23.141286] encourage_send_waiting_buffer: sq 0x8201f0 sent EAGER now SQ_WAITING_EAGER_ACK.
[D 10:46:23.141286] check_cq: found something.
[D 10:46:23.141286] check_cq: ack message 11.0.0.1:49077 my bufnum 0.
[D 10:46:23.141286] check_cq: sq 0x8201f0 SQ_WAITING_EAGER_ACK -> SQ_WAITING_USER_TEST.
[D 10:46:23.141286] test_sq: sq 0x8201f0 completed 1040 to 11.0.0.1:49077.
[D 10:46:23.141286] BMI_testcontext completing: 130
[D 10:46:23.141286] (0x81f190) getconfig (FR sm) state: cleanup
[A 10:46:23.141286] root.root at 11.0.0.1:49077 H=0 S=0x81f190: getconfig: finish (Success)
[D 10:46:23.141286] ib_mem_deregister: buf 0x89c180 len 32808 lkey 513b8c00 rkey 513b8c00.
[D 10:46:23.141286] (0x81f190) getconfig state: cleanup
[D 10:46:23.141286] check_cq: found something.
[D 10:46:23.141286] check_cq: found len 56 at 11.0.0.1:49077 my bufnum 1 type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:23.141286] encourage_recv_incoming: incoming msg type MSG_EAGER_SENDUNEXPECTED.
[D 10:46:23.141286] encourage_recv_incoming: recv eager unexpected my bufnum 1 his bufnum 1 len 56.
[D 10:46:23.141286] encourage_recv_incoming: new rq 0x81a8b0 now RQ_EAGER_WAITING_USER_TESTUNEXPECTED.
[D 10:46:23.141286] BMI_ib_testunexpected: found waiting testunexpected.
[D 10:46:23.141286] post_rr: 11.0.0.1:49077 bh 1.
[D 10:46:23.141286] post_sr_ack: 11.0.0.1:49077 his buf 1 wr 1/8191.
[D 10:46:23.141286] (0x820330) mgmt_setparam (prelude sm) state: req_sched
[A 10:46:23.141286] root.root at 11.0.0.1:49077 H=0 S=0x820330: mgmt_setparam: request
[A 10:46:23.141286] root.root at 11.0.0.1:49077 H=0 S=0x820330: mgmt_setparam: start
[D 10:46:23.141286] (0x820330) mgmt_setparam (prelude sm) state: getattr_if_needed
[D 10:46:23.141286] PVFS operation "mgmt_setparam" got attr mask 127
	(attr_uid_valid? yes, attr_owner = 0, credentials_uid = 0)
	(attr_gid_valid? yes, attr_group = 0, credentials.gid = 0)
[D 10:46:23.141286] Final permission check for "mgmt_setparam" set error code to 0
[D 10:46:23.141286] (0x820330) mgmt_setparam (prelude sm) state: perm_check (status = 0)
[D 10:46:23.141286] (0x820330) mgmt_setparam state: work
[E 10:46:27.581386] 
PVFS2 server got signal 15 (server_status_flag: 262143)


More information about the Pvfs2-developers mailing list