[Pvfs2-developers] bmi testcontext/testunexpected

Rob Ross rross at mcs.anl.gov
Tue Jan 6 18:03:28 EST 2009


I think if we had this alternative design and one wanted to have  
different priorities, one would look for messages under different  
contexts as you say. But when you don't care about priority, it would  
be nice to be able to get everything in one call.

Rob

On Jan 6, 2009, at 4:57 PM, Sam Lang wrote:

>
> Changing the API as you describe would actually bring back the  
> original problem.  As is, the BMI_tcp_testcontext call knows that  
> there are unexpected messages waiting, so it returns immediately  
> (expecting a call to testunexpected to follow).  This is a specific  
> policy hard-coded in the tcp method.
>
> With just a single testcontext call and all expected and unexpected  
> messages going to that context, the tcp code would have to put all  
> the unexpected messages at the top of the context to give them  
> priority.  This would fix the particular problem that Nawab has, but  
> it's still dictating policy (which messages get priority) from within  
> the particular BMI method.
>
> I agree that forcing the application to define the policy (with  
> threads or timeouts) is moving the problem elsewhere, but it's moving  
> the problem to where it belongs.  It's our PVFS server that wants  
> unexpected messages to have priority; the BMI code itself shouldn't  
> dictate that priority.  We could define interfaces to BMI that allow  
> the policy to be set, but that's even further from where we are now.
>
> -sam
>
> On Jan 6, 2009, at 2:52 PM, Rob Ross wrote:
>
>> Yeah, a specially named context for unexpected messages would have  
>> been a clean way to do things... -- Rob
>>
>> On Jan 6, 2009, at 2:49 PM, Phil Carns wrote:
>>
>>> Yeah, I don't particularly like adding special cases either.
>>>
>>> I feel like making the consumer play with timeouts or use an extra  
>>> thread would be just as much of a hack/workaround, though.  It's  
>>> just moving the problem elsewhere.
>>>
>>> Fundamentally it seems more like a BMI API flaw.  It would have  
>>> made more sense (for example) if unexpected messages were assigned  
>>> to a specific context and the testunexpected() and testcontext()  
>>> functions were combined.  The consumer could then use a single  
>>> test call to retrieve both unexpected and normal messages at once  
>>> if they are in the same context (as in the pvfs2-server use  
>>> case).  Testing on a different context would ignore the presence  
>>> of unexpected messages (as in the problem-triggering use case here).
>>>
>>> There are other ways to deal with it, that's just an example.  We  
>>> just need the API to better express the intention of the caller  
>>> (preferably in one function) so that BMI doesn't have to optimize  
>>> by guessing about what else is going on.
>>>
>>> That is more work than just adding a flag, though :)  It probably  
>>> depends on whether we think the use case is going to be around long  
>>> enough to justify tweaking the API.
>>>
>>> -Phil
>>>
>>> Sam Lang wrote:
>>>> I've committed the set_info fix for this.  I'm not crazy about  
>>>> it, but it should work for now.  In the long term, we should  
>>>> probably move away from method-specific hacks like this.  I.e., it  
>>>> should be up to the API consumer (our server) to adjust timeouts  
>>>> or call testunexpected in a separate thread.
>>>>
>>>> Nawab, in the zoidfs init code after initializing BMI you need to  
>>>> call:
>>>>
>>>>     int check = 0;
>>>>     BMI_set_info(0, BMI_TCP_CHECK_UNEXPECTED, &check);
>>>>
>>>> -sam
>>>>
>>>> On Dec 23, 2008, at 2:01 PM, Phil Carns wrote:
>>>>> Sam Lang wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I think Nawab has found a bug (or untested code path) in the  
>>>>>> BMI tcp method.  He's running a daemon that both receives  
>>>>>> unexpected requests (as a server), and receives expected  
>>>>>> responses (as a client).
>>>>>>
>>>>>> In the BMI_testcontext call, if there aren't any completed  
>>>>>> (expected) operations, and there are completed unexpected  
>>>>>> receives, we return immediately, assuming that  
>>>>>> BMI_testunexpected will be called in turn.  I think the idea  
>>>>>> here is that we want to keep our latency down for unexpected  
>>>>>> messages, instead of doing work on expected messages while  
>>>>>> unexpected messages are waiting in the hopper.  But the daemon  
>>>>>> is single-threaded and makes blocking PVFS_sys_* calls, so we  
>>>>>> essentially spin forever calling BMI_testcontext over and over.
>>>>>>
>>>>>> I'm not sure of the best way to fix this.  Easy fixes would be  
>>>>>> to remove the check for completed unexpected receives, and/or  
>>>>>> do tcp_do_work for a shorter timeout.
>>>>>>
>>>>>> It seems like we have a special case for blocking PVFS_sys_*  
>>>>>> calls.  We want to ignore unexpected receives just in that  
>>>>>> case, and actually call tcp_do_work.  In other contexts, I  
>>>>>> think we want the behavior that we have now, where we assume  
>>>>>> that a BMI_testunexpected call will follow a BMI_testcontext  
>>>>>> call.  We could modify the testcontext call to take a separate  
>>>>>> parameter, but that seems messy.  We might also be able to  
>>>>>> handle this with separate BMI contexts somehow...
>>>>>
>>>>> I haven't dug in the code yet to see if I see any more elegant  
>>>>> way to handle it, but I wanted to mention that if you want to  
>>>>> add a special flag to toggle the behavior, it might be better to  
>>>>> just set it globally with the set_info() function rather than  
>>>>> modifying the testcontext() api.  That way you don't have to  
>>>>> change any of the other BMI methods. There are already a couple  
>>>>> of similar set_info() calls to toggle BMI behavior for different  
>>>>> use cases.
>>>>>
>>>>> -Phil
>>>
>>
>


