[Pvfs2-developers] server crash on startup with millions of files

Sam Lang slang at mcs.anl.gov
Fri Feb 23 09:30:27 EST 2007


Sounds good.  I've attached the patch for those who need it.

-sam

-------------- next part --------------
A non-text attachment was scrubbed...
Name: race.patch
Type: application/octet-stream
Size: 1265 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20070223/08372b0d/race.obj
-------------- next part --------------

On Feb 23, 2007, at 8:15 AM, Phil Carns wrote:

> I tried out Pete's suggestion and all of the nodes ran overnight  
> without any trouble, about 7 million scans per server so far.
>
> The only modification was to the DBPF_COMPLETION_START macro:  
> getting the queue mutex first and also doing a queue add before  
> touching the state.
>
> I think the change is safe to commit if it looks ok on your end...
>
> -Phil
>
> Phil Carns wrote:
>> It ended up taking a little work to get another environment to  
>> trigger this reliably, but I think I have something now.
>> I modified the iterate_handles() function a bit so that it keeps  
>> scanning over and over again indefinitely rather than letting the  
>> server start up.  This forces the code path in question without  
>> having to restart the servers.  Using this setup I'm able to  
>> trigger it on an empty 8 node file system, but I have to leave all  
>> of the servers running on it anywhere from a few minutes to half  
>> an hour before one of them crashes.  Oddly enough, with this  
>> environment it crashes faster on an empty file system than on one  
>> with 500,000 files.
>> I repeated this test with the latest HEAD version from trunk, and  
>> that didn't seem to make any difference.
>> I'll try the mutex suggestion next.
>> -Phil
>> Sam Lang wrote:
>>>
>>> On Feb 20, 2007, at 11:32 AM, Pete Wyckoff wrote:
>>>
>>>> pcarns at wastedcycles.org wrote on Tue, 20 Feb 2007 07:29 -0500:
>>>>
>>>>> dbpf-dspace.c:1371
>>>>> assert(!dbpf_op_queue_empty(dbpf_completion_queue_array[context_id]));
>>>>>
>>>>> According to the stack trace, this test() call followed a
>>>>> trove_dspace_iterate_handles() call within the
>>>>> trove_check_handle_ranges() function.  This is part of the logic
>>>>> on startup that scans all of the handles in the storage space to
>>>>> update the list of available/used handles in trove-handle-mgmt.
>>>>
>>>>
>>>>
>>>> Another thought for Sam, who knows this code better.
>>>>
>>>> (1) DBPF_COMPLETION_START modifies cur_op->op.state without holding
>>>> the dbpf_completion_queue_array_mutex[cid] mutex.  Then it grabs the
>>>> mutex and puts the op on the completion array.
>>>>
>>>> (2) dbpf_dspace_test grabs that mutex, looks at op.state, then
>>>> asserts that the queue must not be empty.
>>>>
>>>> Perhaps (1) modifies the state but doesn't get around to putting it
>>>> on the completion array.  (Possibly because the lock is held in
>>>> (2).)
>>>>
>>> Good point, Pete.  Given that this seems to be the race Phil is
>>> seeing, your theory seems more likely.
>>>
>>>> Maybe (1) should put the op on the array before modifying its
>>>> state, and hold the array mutex the whole time.  I'm not sure what
>>>> kind of locking rules are involved between the mutex in the op and
>>>> the mutex on the completion array, though.  Or what else might
>>>> break with such a change.
>>>
>>>
>>>
>>> I don't think there should be any problems with doing this.  It
>>> probably doesn't matter when the op is added to the completion
>>> queue (before or after its state gets changed), just that the
>>> completion queue's mutex gets locked before either (at the top of
>>> DBPF_COMPLETION_START).  I wonder if Phil could make this change
>>> and run his tests again.
>>>
>>>>
>>>>         -- Pete
>>>>
>>>
>
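
For readers without the attachment, here is a minimal sketch of the
ordering discussed in this thread: take the completion queue mutex
first, add the op to the queue, and only then change its state.  It is
not the contents of race.patch and it does not use the real DBPF types.
Every name in it (struct op, struct completion_queue, completion_start,
test_op, run_sketch) is a hypothetical stand-in, and the assumption
that the tester removes the op from the queue is mine.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DBPF structures, not the PVFS2 types. */
enum op_state { OP_IN_SERVICE, OP_COMPLETED };

struct op {
    enum op_state state;
    struct op *next;
};

struct completion_queue {
    pthread_mutex_t mutex;
    struct op *head;
};

#define NOPS 1000
static struct completion_queue cq = { PTHREAD_MUTEX_INITIALIZER, NULL };
static struct op ops[NOPS];

/* Ordering proposed in the thread: lock the queue, enqueue the op,
 * then update its state, all while the queue mutex is held. */
static void completion_start(struct op *o)
{
    pthread_mutex_lock(&cq.mutex);
    o->next = cq.head;
    cq.head = o;
    o->state = OP_COMPLETED;
    pthread_mutex_unlock(&cq.mutex);
}

/* Tester side: under the same mutex, a completed op implies a
 * non-empty queue.  Returns 1 and dequeues the op if it is done. */
static int test_op(struct op *o)
{
    int done = 0;
    pthread_mutex_lock(&cq.mutex);
    if (o->state == OP_COMPLETED) {
        struct op **pp = &cq.head;
        assert(cq.head != NULL);      /* the invariant from the assert */
        while (*pp != NULL && *pp != o)
            pp = &(*pp)->next;
        assert(*pp == o);             /* the op itself must be queued */
        *pp = o->next;
        done = 1;
    }
    pthread_mutex_unlock(&cq.mutex);
    return done;
}

/* Completer thread: marks every op complete, one after another. */
static void *worker(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < NOPS; i++)
        completion_start(&ops[i]);
    return NULL;
}

/* Runs completer and tester concurrently; returns ops seen complete,
 * or -1 if the thread could not be created. */
static int run_sketch(void)
{
    pthread_t t;
    int i, completed = 0;

    cq.head = NULL;
    for (i = 0; i < NOPS; i++) {
        ops[i].state = OP_IN_SERVICE;
        ops[i].next = NULL;
    }
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    for (i = 0; i < NOPS; i++) {
        while (!test_op(&ops[i]))
            sched_yield();            /* poll until the op completes */
        completed++;
    }
    pthread_join(t, NULL);
    return completed;
}
```

Call run_sketch() from main() and build with -pthread.  If the state
were set before the mutex is taken, the tester could observe a
completed op while the queue is still empty, which is the condition
the assert at dbpf-dspace.c:1371 checks for.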


