[Pvfs2-developers] help debugging request processor/distribution

Phil Carns pcarns at wastedcycles.org
Mon Jun 12 14:01:34 EDT 2006


Hi Sam,

I may be wrong, but so far it looks like the problem is a little 
different this time.  In the scenario that I am seeing now it doesn't 
look like the flow has finished everything and then just failed to mark 
completion- it looks like one side actually posted more BMI messages 
than the other.  The server posted two receives of 256K, while the 
client only posted one send of 256K.  I don't see any zero byte results 
from the request processor or any cases where the flow should have 
marked completion.

I'm still trying to get a better log.  After adding mutexes in 
PINT_process_request() on the server side, the problem has gotten much 
harder to reproduce.  I don't know if there is any correlation or if I 
am just perturbing the timing enough to hide the problem.  So far out of 
9 runs I have had only 1 failure, and the logs from that failure were 
inconclusive.

I definitely agree with both of your points about reworking the flow 
code; it is difficult to debug right now.

-Phil

Sam Lang wrote:
> 
> On Jun 12, 2006, at 8:34 AM, Phil Carns wrote:
> 
>> Hi all,
>>
>> I am looking at an I/O problem that I don't completely understand.   
>> The setup is that there are 15 servers and 20 clients (all RHEL3  
>> SMP).  The clients are running a proprietary application.  At the  end 
>> of the run they each write their share of a data set into a 36  GB 
>> file.  So each is doing a contiguous write of 36/20 GB of data.
>>
>> Roughly 80-90% of the time, one or more of the clients and servers  
>> hang, though it isn't predictable which one will do it.  After  quite 
>> a bit of digging through logs, it turns out that there is a  flow that 
>> never completes on the server side.  There are two  problems 
>> contributing here:
>>
>> 1) The flow timeout mechanism doesn't seem to be working right for  me 
>> (the timer for the flow pops and checks progress one time, then  never 
>> gets triggered again).  This isn't the core problem, but it  is 
>> causing the flow to get stuck forever and hold up the request  
>> scheduler rather than giving up eventually.
>>
>> 2) For this particular flow, the client and server appear to  disagree 
>> on how much data to exchange.  The client sends one 256K  buffer and 
>> completes the flow.  The server receives that 256K  buffer but still 
>> expects at least one more to arrive (which never  does).
>>
> 
> Hi Phil, these problems sound very similar to the bug I was seeing  
> about a month ago with the flow code for certain request types.  The  
> particular case I was looking at was with reads, so given that  there's 
> a lot of code duplication in flow I could imagine the same  problem 
> existing (but not getting fixed) for writes.  It may not be  the same as 
> the problem you're seeing, but I'll walk through the bug  and how I 
> fixed it, in case it helps.
> 
> Basically, the request was ending right at the boundary of a flow  
> buffer (so bytes == bytemax), this caused the request processing code  
> to finish and the bmi_send_callback_fn would return, but it would get  
> called again by the post code for the next buffer, except this time  the 
> request processing would return 0 bytes, and this queue item  would get 
> marked as the last one.  It was that last queue item that  was the 
> problem: since the trove_bstream_read_list was never getting  posted, 
> the queue item who's last field was set, was never making it  into the 
> trove_read_callback_fn, so the dest_last_posted field for  the flow data 
> wasn't getting set either.
> 
> The fix was to add some extra checking in bmi_send_callback_fn.  In  the 
> case that the request processing returns 0 bytes processed, I can  set 
> the dest_last_posted field to 1 immediately.  The only caveat to  this 
> is in the case that other queue items in the source list haven't  made 
> it to the dest list yet, so we need to check for that case.  The  diff 
> of the changes I made might help:
> 
> http://www.pvfs.org/cgi-bin/pvfs2/viewcvs/viewcvs.cgi/pvfs2/src/io/ 
> flow/flowproto-bmi-trove/flowproto-multiqueue.c.diff?r1=1.105&r2=1.106
> 
> In any case, I wasn't able to make use of the debug messages for the  
> request processing code, it was too verbose to make any sense of it.   I 
> think I ended up just adding debug statements for the entry and  exit 
> points of the flow callbacks along with all the different side- state 
> fields as they were changed.  I'm not sure those debug  statements made 
> it into the trunk though. :-(
> 
> Since we're here, I think the flow code could benefit from some  
> reworking in two ways:
> 
> 1.  Less side-state variables to be set.  Each queue item has fields  
> that change the behavior of when things should complete.  Its hard to  
> keep track of all that.
> 
> 2.  More code abstraction.  We end up doing very similar things  between 
> reads and writes on the server (and on the client too),  especially with 
> respect to the request processing.  It might make  sense to abstract out 
> some of that code duplication into separate  functions.
> 
> -sam
> 
>> I am pretty sure I can get to the bottom of the first problem.  The  
>> second one is trickier.
>>
>> I attached a few log file excerpts:
>>
>> client-1st.txt: This is the request processing log output from the  
>> client side when it is working on the flow in question
>>
>> server-1st.txt: request processing log output from the first  
>> iteration of the server side flow
>>
>> server-2nd.txt: request processing log output from the second  
>> iteration of the server side flow
>>
>> Anyone have any suggestions, or see any indication from those log  
>> portions about why the server and client arn't on the same page?
>>
>> Some of this log output (the server-2nd.txt in particular?) looks a  
>> little jumbled, like maybe there are two flows doing request  
>> processing at once, but I don't really know how to interpret it.
>>
>> I may try adding a mutex around PINT_process_request() next, just  to 
>> see if I can make the logs clearer by making sure only one  invocation 
>> is going at once.
>>
>> -Phil
>> [D 06/09 15:49]  
>> =========================================================
>> [D 06/09 15:49] PINT_process_request
>> [D 06/09 15:49]         tiling 5484 copies
>> [D 06/09 15:49]         skipping ahead to target_offset
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 0
>> [D 06/09 15:49]                 to 0 ta 22993719939 fi 22997914243
>> [D 06/09 15:49]                 mto 0 mta 0 mfi 4194304
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 0 contig_size = 23001563136  
>> lvl_flag =
>> 1
>> [D 06/09 15:49]         process logical skip
>> [D 06/09 15:49]         retval = 22993719939
>> [D 06/09 15:49]         exiting logical skip because distribute  
>> indicates done
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 22993719939
>> [D 06/09 15:49]                 to 22993719939 ta 22993719939 fi  
>> 22997914243
>> [D 06/09 15:49]                 mto 0 mta 0 mfi 4194304
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 22993719939 contig_size =  
>> 7843197
>> lvl_flag = 1
>> [D 06/09 15:49]         calling distribute
>> [D 06/09 15:49]         PINT_distribute
>> [D 06/09 15:49]                 of 22993719939 sz 4194304 ix 0 sm  64 
>> by 0 bm
>> 262144 fsz 0 exfl 1
>> [D 06/09 15:49]                 begin iteration loff: 22994157568
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]                 client lstof 0 diff 437629 sgof 437629
>> [D 06/09 15:49] **********CALL***PROCESS*********
>> [D 06/09 15:49]                 segment of 437629 sz 65536
>> [D 06/09 15:49] PINT_process_request
>> [D 06/09 15:49]         skipping ahead to target_offset
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 0
>> [D 06/09 15:49]                 to 0 ta 437629 fi 503165
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 0 contig_size = 4194304  
>> lvl_flag = 1
>> [D 06/09 15:49]         process logical skip
>> [D 06/09 15:49]         retval = 437629
>> [D 06/09 15:49]         exiting logical skip because distribute  
>> indicates done
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 437629
>> [D 06/09 15:49]                 to 437629 ta 437629 fi 503165
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 437629 contig_size =  3756675 
>> lvl_flag
>> = 1
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 437629 sz 65536
>> [D 06/09 15:49]         add a segment 0
>> [D 06/09 15:49]         retval = 65536
>> [D 06/09 15:49]         exiting distribute returned less than expected
>> [D 06/09 15:49]         done sg 1 sm 64 by 65536 bm 262144 ta  437629 
>> to 503165
>> fo 503165 eof 0
>> [D 06/09 15:49] *****RETURN***FROM***PROCESS*****
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 begin iteration loff: 22995140608
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]                 client lstof 503165 diff 917504  sgof 
>> 1420669
>> [D 06/09 15:49] **********CALL***PROCESS*********
>> [D 06/09 15:49]                 segment of 1420669 sz 65536
>> [D 06/09 15:49] PINT_process_request
>> [D 06/09 15:49]         skipping ahead to target_offset
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 503165
>> [D 06/09 15:49]                 to 503165 ta 1420669 fi 1486205
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 503165 contig_size =  3691139 
>> lvl_flag
>> = 1
>> [D 06/09 15:49]         process logical skip
>> [D 06/09 15:49]         retval = 917504
>> [D 06/09 15:49]         exiting logical skip because distribute  
>> indicates done
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 1420669
>> [D 06/09 15:49]                 to 1420669 ta 1420669 fi 1486205
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 1420669 contig_size =  2773635 
>> lvl_flag
>> = 1
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 1420669 sz 65536
>> [D 06/09 15:49]         add a segment 1
>> [D 06/09 15:49]         retval = 65536
>> [D 06/09 15:49]         exiting distribute returned less than expected
>> [D 06/09 15:49]         done sg 2 sm 64 by 131072 bm 262144 ta  
>> 1420669 to
>> 1486205 fo 1486205 eof 0
>> [D 06/09 15:49] *****RETURN***FROM***PROCESS*****
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 begin iteration loff: 22996123648
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]                 client lstof 1486205 diff 917504  sgof 
>> 2403709
>> [D 06/09 15:49] **********CALL***PROCESS*********
>> [D 06/09 15:49]                 segment of 2403709 sz 65536
>> [D 06/09 15:49] PINT_process_request
>> [D 06/09 15:49]         skipping ahead to target_offset
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 1486205
>> [D 06/09 15:49]                 to 1486205 ta 2403709 fi 2469245
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 1486205 contig_size =  2708099 
>> lvl_flag
>> = 1
>> [D 06/09 15:49]         process logical skip
>> [D 06/09 15:49]         retval = 917504
>> [D 06/09 15:49]         exiting logical skip because distribute  
>> indicates done
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 2403709
>> [D 06/09 15:49]                 to 2403709 ta 2403709 fi 2469245
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 2403709 contig_size =  1790595 
>> lvl_flag
>> = 1
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 2403709 sz 65536
>> [D 06/09 15:49]         add a segment 2
>> [D 06/09 15:49]         retval = 65536
>> [D 06/09 15:49]         exiting distribute returned less than expected
>> [D 06/09 15:49]         done sg 3 sm 64 by 196608 bm 262144 ta  
>> 2403709 to
>> 2469245 fo 2469245 eof 0
>> [D 06/09 15:49] *****RETURN***FROM***PROCESS*****
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 begin iteration loff: 22997106688
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]                 client lstof 2469245 diff 917504  sgof 
>> 3386749
>> [D 06/09 15:49] **********CALL***PROCESS*********
>> [D 06/09 15:49]                 segment of 3386749 sz 65536
>> [D 06/09 15:49] PINT_process_request
>> [D 06/09 15:49]         skipping ahead to target_offset
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 2469245
>> [D 06/09 15:49]                 to 2469245 ta 3386749 fi 3452285
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 2469245 contig_size =  1725059 
>> lvl_flag
>> = 1
>> [D 06/09 15:49]         process logical skip
>> [D 06/09 15:49]         retval = 917504
>> [D 06/09 15:49]         exiting logical skip because distribute  
>> indicates done
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 3386749
>> [D 06/09 15:49]                 to 3386749 ta 3386749 fi 3452285
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 3386749 contig_size =  807555 
>> lvl_flag
>> = 1
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 3386749 sz 65536
>> [D 06/09 15:49]         add a segment 3
>> [D 06/09 15:49]         retval = 65536
>> [D 06/09 15:49]         exiting distribute returned less than expected
>> [D 06/09 15:49]         done sg 4 sm 64 by 262144 bm 262144 ta  
>> 3386749 to
>> 3452285 fo 3452285 eof 0
>> [D 06/09 15:49] *****RETURN***FROM***PROCESS*****
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 done with segments or bytes
>> [D 06/09 15:49]                         of 22997172224 sz 742019 sg  4 
>> sm 64 by
>> 262144 bm 262144
>> [D 06/09 15:49]                         next loff: 22998089728 next  
>> poff:
>> 1533149184
>> [D 06/09 15:49]                         (return value) 4194304
>> [D 06/09 15:49]                 finished
>> [D 06/09 15:49]         retval = 4194304
>> [D 06/09 15:49]         exiting distribute returned less than expected
>> [D 06/09 15:49]         done sg 4 sm 64 by 262144 bm 262144 ta  
>> 22993719939 to
>> 22997914243 fo 22997914243 eof 0
>> [D 06/09 15:49] PINT_process_request
>> [D 06/09 15:49]         tiling 9156 copies
>> [D 06/09 15:49]         skipping ahead to target_offset
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 0
>> [D 06/09 15:49]                 to 0 ta 38396548819 fi 38400743123
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 0 contig_size = 38403047424  
>> lvl_flag =
>> 1
>> [D 06/09 15:49]         process logical skip
>> [D 06/09 15:49]         retval = 38396548819
>> [D 06/09 15:49]         exiting logical skip because distribute  
>> indicates done
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 38396548819
>> [D 06/09 15:49]                 to 38396548819 ta 38396548819 fi  
>> 38400743123
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 38396548819 contig_size =  
>> 6498605
>> lvl_flag = 1
>> [D 06/09 15:49]         calling distribute
>> [D 06/09 15:49]         PINT_distribute
>> [D 06/09 15:49]                 of 38396548819 sz 4194304 ix 0 sm  64 
>> by 0 bm
>> 262144 fsz 2559770624 exfl 1
>> [D 06/09 15:49]                 begin iteration loff: 38396690432
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 2559770624 sz 65536
>> [D 06/09 15:49]         add a segment 0
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 begin iteration loff: 38397673472
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 2559836160 sz 65536
>> [D 06/09 15:49]         combine a segment 0
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 begin iteration loff: 38398656512
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 2559901696 sz 65536
>> [D 06/09 15:49]         combine a segment 0
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 begin iteration loff: 38399639552
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 2559967232 sz 65536
>> [D 06/09 15:49]         combine a segment 0
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 done with segments or bytes
>> [D 06/09 15:49]                         of 38399705088 sz 1038035  sg 
>> 1 sm 64
>> by 262144 bm 262144
>> [D 06/09 15:49]                         next loff: 38400622592 next  
>> poff:
>> 2560032768
>> [D 06/09 15:49]                         return value 3156269
>> [D 06/09 15:49]                 finished
>> [D 06/09 15:49]         retval = 3156269
>> [D 06/09 15:49]         exiting distribute returned less than expected
>> [D 06/09 15:49]         done sg 1 sm 64 by 262144 bm 262144 ta  
>> 38396548819 to
>> 38399705088 fo 38400743123 eof 0
>> [D 06/09 15:49]  
>> =========================================================
>> [D 06/09 15:49] PINT_process_request
>> [D 06/09 15:49]         tiling 8489 copies
>> [D 06/09 15:49]         skipping ahead to target_offset
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 0
>> [D 06/09 15:49]                 to 0 ta 35599341710 fi 35603536014
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 0 contig_size = 35605446656  
>> lvl_flag =
>> 1
>> [D 06/09 15:49]         process logical skip
>> [D 06/09 15:49]         retval = 35599341710
>> [D 06/09 15:49]         exiting logical skip because distribute  
>> indicates done
>> [D 06/09 15:49]  
>> =========================================================
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49] PINT_process_request
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 35599341710
>> [D 06/09 15:49]         tiling 9156 copies
>> [D 06/09 15:49]                 to 35599341710 ta 35599341710 fi  
>> 35603536014
>> [D 06/09 15:49]         Do seq of 0 ne 4194304 st 0 nb 1 ub 4194304  
>> lb 0 as
>> 4194304 co 0
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]                 lvl 0 el 0 blk 0 by 38399705088
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]         contig_offset = 35599341710 contig_size =  
>> 6104946
>> lvl_flag = 1
>> [D 06/09 15:49]                 to 38399705088 ta 38396548819 fi  
>> 38400743123
>> [D 06/09 15:49]         calling distribute
>> [D 06/09 15:49]         PINT_distribute
>> [D 06/09 15:49]         basic type or contiguous data
>> [D 06/09 15:49]                 of 35599341710 sz 4194304 ix 0 sm  64 
>> by 0 bm
>> 262144 fsz 2559770624 exfl 1
>> [D 06/09 15:49]         Request disp
>> [D 06/09 15:49]                 begin iteration loff: 35599941632
>> [D 06/09 15:49]         contig_offset = 38399705088 contig_size =  
>> 3342336
>> lvl_flag = 1
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]         calling distribute
>> [D 06/09 15:49]                         of 2373320704 sz 65536
>> [D 06/09 15:49]         add a segment 0
>> [D 06/09 15:49]         PINT_distribute
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 begin iteration loff: 35600924672
>> [D 06/09 15:49]                 of 38399705088 sz 1038035 ix 0 sm  64 
>> by 0 bm
>> 262144 fsz 2560032768 exfl 1
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 2373386240 sz 65536
>> [D 06/09 15:49]         combine a segment 0
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 begin iteration loff: 38400622592
>> [D 06/09 15:49]                 begin iteration loff: 35601907712
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 2373451776 sz 65536
>> [D 06/09 15:49]                 file being extended
>> [D 06/09 15:49]         combine a segment 0
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                 begin iteration loff: 35602890752
>> [D 06/09 15:49]                 frag extends beyond strip
>> [D 06/09 15:49]                         of 2560032768 sz 65536
>> [D 06/09 15:49]         process a segment
>> [D 06/09 15:49]                         of 2373517312 sz 65536
>> [D 06/09 15:49]         add a segment 0
>> [D 06/09 15:49]         combine a segment 0
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 end iteration
>> [D 06/09 15:49]                 done with segments or bytes
>> [D 06/09 15:49]                         of 35602956288 sz 579726 sg  1 
>> sm 64 by
>> 262144 bm 262144
>> [D 06/09 15:49]                         of 38400688128 sz 54995 sg  1 
>> sm 64 by
>> 65536 bm 262144
>> [D 06/09 15:49]                         next loff: 35603873792 next  
>> poff:
>> 2373582848
>> [D 06/09 15:49]                         (return value) 4194304
>> [D 06/09 15:49]                         next loff: 38401605632 next  
>> poff:
>> 2560098304
>> [D 06/09 15:49]                 finished
>> [D 06/09 15:49]                         (return value) 1038035
>> [D 06/09 15:49]         retval = 4194304
>> [D 06/09 15:49]         exiting distribute returned less than expected
>> [D 06/09 15:49]                 finished
>> [D 06/09 15:49]         done sg 1 sm 64 by 262144 bm 262144 ta  
>> 35599341710 to
>> 35603536014 fo 35603536014 eof 0
>> [D 06/09 15:49]         retval = 1038035
>> [D 06/09 15:49]  
>> =========================================================
>> [D 06/09 15:49]         exiting distribute returned less than expected
>> [D 06/09 15:49]         done sg 1 sm 64 by 65536 bm 262144 ta  
>> 38396548819 to
>> 38400743123 fo 38400743123 eof 0
>> _______________________________________________
>> Pvfs2-developers mailing list
>> Pvfs2-developers at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> 
> 



More information about the Pvfs2-developers mailing list