[Pvfs2-developers] server crash on startup with millions of files
Phil Carns
pcarns at wastedcycles.org
Fri Feb 23 09:15:10 EST 2007
I tried out Pete's suggestion and all of the nodes ran overnight without
any trouble, about 7 million scans per server so far.
The only modification was to the DBPF_COMPLETION_START macro: it now takes
the completion queue mutex first, and it adds the op to the queue before
touching the op state.
I think the change is safe to commit if it looks ok on your end...
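
To illustrate the ordering described above, here is a minimal sketch in plain C with pthreads. It is not the actual DBPF_COMPLETION_START macro or the real dbpf-op-queue code: the struct layouts and the names completion_start(), test_op(), and run_stress() are hypothetical stand-ins chosen for this example. The only assumption carried over from the thread is the rule itself: lock the completion queue mutex, add the op, then change its state, so that a tester holding the same mutex never sees a completed op with an empty queue.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins for a DBPF op and its completion queue. */
enum op_state { OP_IN_SERVICE, OP_COMPLETED };

struct op {
    enum op_state state;
    struct op *next;
};

struct completion_queue {
    pthread_mutex_t mutex;
    struct op *head;
};

/* Completion side: take the queue mutex first, add the op to the queue,
 * and only then touch its state, all under the same lock. */
static void completion_start(struct completion_queue *q, struct op *o)
{
    pthread_mutex_lock(&q->mutex);
    o->next = q->head;
    q->head = o;
    o->state = OP_COMPLETED;
    pthread_mutex_unlock(&q->mutex);
}

/* Tester side: returns 0 if the op is not complete yet, 1 if it was
 * completed and removed from the queue, and -1 if the op is marked
 * complete while the queue is empty (the condition the assert caught). */
static int test_op(struct completion_queue *q, struct op *o)
{
    int ret = 0;
    pthread_mutex_lock(&q->mutex);
    if (o->state == OP_COMPLETED) {
        if (q->head == NULL) {
            ret = -1;
        } else {
            q->head = q->head->next; /* only one op is queued at a time here */
            ret = 1;
        }
    }
    pthread_mutex_unlock(&q->mutex);
    return ret;
}

struct worker_arg {
    struct completion_queue *q;
    struct op *o;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;
    completion_start(a->q, a->o);
    return NULL;
}

/* Repeatedly race one completing thread against one polling tester.
 * Returns the number of times the tester saw the bad condition. */
int run_stress(int iterations)
{
    struct completion_queue q;
    int violations = 0;

    pthread_mutex_init(&q.mutex, NULL);
    q.head = NULL;
    for (int i = 0; i < iterations; i++) {
        struct op o = { OP_IN_SERVICE, NULL };
        struct worker_arg a = { &q, &o };
        pthread_t t;
        int r;

        if (pthread_create(&t, NULL, worker, &a) != 0)
            break;
        while ((r = test_op(&q, &o)) == 0)
            sched_yield(); /* poll until the op completes */
        if (r < 0)
            violations++;
        pthread_join(t, NULL);
        q.head = NULL; /* the op lives on this stack frame; drop any reference */
    }
    pthread_mutex_destroy(&q.mutex);
    return violations;
}
```

Calling run_stress(1000) from a small main() (built with -pthread) should return 0 with this ordering. Moving the state assignment above the pthread_mutex_lock() call in completion_start() recreates the window Pete described, although in that variant the unlocked write is also a data race, so it is only useful as a demonstration.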
-Phil
Phil Carns wrote:
> It ended up taking a little work to get another environment to trigger
> this reliably, but I think I have something now.
>
> I modified the iterate_handles() function a bit so that it keeps
> scanning over and over again indefinitely rather than letting the server
> start up. This forces the code path in question without having to
> restart the servers. Using this setup I'm able to trigger it on an
> empty 8 node file system, but I have to leave all of the servers running
> on it anywhere from a few minutes to half an hour before one of them
> crashes. Oddly enough, with this environment it crashes faster on an
> empty file system than on one with 500,000 files.
>
> I repeated this test with the latest HEAD version from trunk, and that
> didn't seem to make any difference.
>
> I'll try the mutex suggestion next.
>
> -Phil
>
> Sam Lang wrote:
>
>>
>> On Feb 20, 2007, at 11:32 AM, Pete Wyckoff wrote:
>>
>>> pcarns at wastedcycles.org wrote on Tue, 20 Feb 2007 07:29 -0500:
>>>
>>>> dbpf-dspace.c:1371
>>>> assert(!dbpf_op_queue_empty(dbpf_completion_queue_array[context_id]));
>>>>
>>>> According to the stack trace, this test() call followed a
>>>> trove_dspace_iterate_handles() call within the
>>>> trove_check_handle_ranges() function. This is part of the logic on
>>>> startup that scans all of the handles in the storage space to
>>>> update the list of available/used handles in trove-handle-mgmt.
>>>
>>>
>>>
>>> Another thought for Sam, who knows this code better.
>>>
>>> (1) DBPF_COMPLETION_START modifies cur_op->op.state without holding the
>>> dbpf_completion_queue_array_mutex[cid] mutex. Then it grabs the
>>> mutex and puts the op on the completion array.
>>>
>>> (2) dbpf_dspace_test grabs that mutex, looks at op.state, then asserts
>>> that the queue must not be empty.
>>>
>>> Perhaps (1) modifies the state but doesn't get around to putting it
>>> on the completion array. (Possibly because the lock is held in
>>> (2).)
>>>
>> Good point, Pete. Given that this seems to be a race that Phil is
>> seeing, your theory seems more likely.
>>
>>> Maybe (1) should put the op on the array before modifying its state,
>>> and hold the array mutex the whole time. I'm not sure what kind of
>>> locking rules are involved between the mutex in the op and the mutex
>>> on the completion array, though. Or what else might break with such
>>> a change.
>>
>>
>>
>> I don't think there should be any problems with doing this. It
>> probably doesn't matter when the op is added to the completion queue
>> (before or after its state gets changed), just that the completion
>> queue's mutex gets locked before either (at the top of
>> DBPF_COMPLETION_START). I wonder if Phil could make this change and
>> run his tests again.
>>
>>>
>>> -- Pete
>>>
>>
>
>