[Pvfs2-developers] timeouts in the kmod
Sam Lang
slang at mcs.anl.gov
Tue Nov 6 23:58:21 EST 2007
On Nov 6, 2007, at 10:13 PM, Murali Vilayannur wrote:
> Sam,
> The behavior that you describe is definitely correct, but is there any
> particular reason
> we expect downcalls/requests to be stuck for a long period of time in
> client-core
> and not be written back to the device...
> If they occur due to coding errors, we definitely want the schedule()
> in there so that we know that hung processes in "D" state are waiting
> for client-core to respond back..
> I feel that keeping timeouts out of the kernel module is clean for the
> non-error cases since we can push all the timeout stuff into pvfs2
> client-core and the user-level libraries...
> If I recall correctly, in the past we have had problems where flow
> timeouts occur because
> servers were taking a long time to complete all requests..
> The last thing we want is another timeout to add to the mix..
> Am I missing something?
I guess I was thinking that there's already a timeout. If the first
request fails, the second attempt has a 20-second (10 seconds in debug
mode) timeout with schedule_timeout. I was thinking schedule_timeout
could be called instead of schedule with a timeout that exceeds all
the timeouts in client-core. It wouldn't be a big change, and would
prevent that off-chance of some process going out to lunch just
because we have a bug in the client that doesn't handle responses and
timeouts properly. Not a big deal, really.
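
To make the idea concrete, here is a rough user-space sketch of a bounded
wait. It is not the kmod code: a pthread condition variable stands in for
the kernel wait queue, pthread_cond_timedwait stands in for
schedule_timeout, and every name in it (op_sketch, op_sketch_init,
op_sketch_complete, wait_for_downcall_timed) is hypothetical. The caller
is assumed to pass a timeout longer than any timeout in client-core.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for an op that is waiting on a downcall. */
struct op_sketch {
    pthread_mutex_t lock;
    pthread_cond_t  serviced_cv;
    bool            serviced;
};

void op_sketch_init(struct op_sketch *op)
{
    pthread_mutex_init(&op->lock, NULL);
    pthread_cond_init(&op->serviced_cv, NULL);
    op->serviced = false;
}

/* Called by the "client-core" side when the downcall arrives. */
void op_sketch_complete(struct op_sketch *op)
{
    pthread_mutex_lock(&op->lock);
    op->serviced = true;
    pthread_cond_signal(&op->serviced_cv);
    pthread_mutex_unlock(&op->lock);
}

/*
 * Bounded wait: returns 0 if the op was serviced, or ETIMEDOUT if
 * timeout_ms elapsed first. An unbounded wait (the schedule() analogue)
 * would block forever if op_sketch_complete() were never called.
 */
int wait_for_downcall_timed(struct op_sketch *op, long timeout_ms)
{
    struct timespec deadline;
    bool done;
    int rc = 0;

    assert(timeout_ms >= 0);

    /* Condition variables use CLOCK_REALTIME deadlines by default. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&op->lock);
    /* Loop to absorb spurious wakeups; stop on timeout or other error. */
    while (!op->serviced && rc == 0)
        rc = pthread_cond_timedwait(&op->serviced_cv, &op->lock, &deadline);
    done = op->serviced;
    pthread_mutex_unlock(&op->lock);

    return done ? 0 : rc;
}
```

With this shape, a waiter whose downcall never comes back gets ETIMEDOUT
after the deadline instead of sleeping indefinitely, and the normal path
(downcall arrives first) is unchanged.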
-sam
> thanks
> Murali
>
> On 11/5/07, Sam Lang <slang at mcs.anl.gov> wrote:
>>
>> Hi All,
>>
>> It looks like a kmod request to the client daemon only gets a
>> timeout if it's not the first attempt, which looks to occur only
>> if the client daemon falls over and gets restarted. Instead of
>> calling schedule() in that first wait_for_downcall, why not call
>> schedule_timeout with some long-ish timeout that exceeds the timeouts
>> of the client daemon?
>>
>> I just wonder what happens to scheduled procs on downcalls if by some
>> coding error the client stays alive but never returns a downcall. It
>> looks like the process never gets woken up...
>>
>> -sam
>> _______________________________________________
>> Pvfs2-developers mailing list
>> Pvfs2-developers at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>
>