[Pvfs2-developers] fix BMI multiplexing of multiple methods

Sam Lang slang at mcs.anl.gov
Wed Jan 7 17:06:06 EST 2009


Hi All,

Right now, if multiple methods are enabled in BMI, we tend to get poor
performance from the "fast" network, because BMI_testcontext iterates
through all the active methods, calling testcontext for each one.  It
tries to be smart about which methods get scheduled ;-) to prevent
starvation, but it treats all the methods equally, so tcp (the slow
one) tends to hog the time spent in testcontext.  I have a few
ideas for this, so I'll go ahead and propose them and let you all
shoot them down or propose others.

Option CALLBACK:  Instead of returning completions as a list from
testcontext, we allow a BMI context to be constructed with a callback,
and on completion of an operation, the callback is called.  This allows
each method to drive its own operations and notify the consumer of
completion immediately.  There would still need to be a testcontext
call for methods that only service operations during that call.  The
changes might not be that significant: the BMI_open_context call could
just take an extra parameter for the callback function.  If the
parameter is null, we just use the completion list as before.
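
To make the CALLBACK idea concrete, here is a rough, standalone sketch.
It does not use the real BMI headers: every sketch_* name, the callback
signature, and the fixed-size completion list are my assumptions for
illustration only, not the actual BMI_open_context interface.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DONE 64

typedef long sketch_op_id;   /* hypothetical operation handle */

/* Hypothetical completion callback type (not part of BMI). */
typedef void (*sketch_completion_cb)(sketch_op_id id, int error,
                                     void *user_ptr);

struct sketch_context {
    sketch_completion_cb cb;             /* NULL means "queue completions" */
    sketch_op_id done[SKETCH_MAX_DONE];  /* completion list, as before */
    int ndone;
};

/* Stand-in for an open_context call that takes the extra parameter. */
void sketch_open_context(struct sketch_context *ctx, sketch_completion_cb cb)
{
    assert(ctx != NULL);
    ctx->cb = cb;
    ctx->ndone = 0;
}

/* Called by a method when one of its operations finishes.
 * Returns 0 on success, -1 if the completion list is full. */
int sketch_complete(struct sketch_context *ctx, sketch_op_id id,
                    int error, void *user_ptr)
{
    if (ctx->cb != NULL) {
        ctx->cb(id, error, user_ptr);    /* notify the consumer right away */
        return 0;
    }
    if (ctx->ndone == SKETCH_MAX_DONE)
        return -1;
    ctx->done[ctx->ndone++] = id;        /* old behaviour: queue it */
    return 0;
}

/* Drains up to incount queued completions; returns how many were copied. */
int sketch_testcontext(struct sketch_context *ctx, sketch_op_id *out,
                       int incount)
{
    int n = ctx->ndone < incount ? ctx->ndone : incount;
    int i;

    for (i = 0; i < n; i++)
        out[i] = ctx->done[i];
    for (i = n; i < ctx->ndone; i++)     /* keep the rest for the next call */
        ctx->done[i - n] = ctx->done[i];
    ctx->ndone -= n;
    return n;
}

/* Example consumer callback: counts completions through user_ptr. */
void sketch_count_cb(sketch_op_id id, int error, void *user_ptr)
{
    (void)id;
    (void)error;
    if (user_ptr != NULL)
        ++*(int *)user_ptr;
}
```

A context opened with a callback never touches the completion list, while
a context opened with NULL behaves as it does today, so existing callers
would not need to change.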

Option CONTEXT:  Require separate contexts for separate methods.  This
pushes the problem up to the application, which is probably not where it
belongs, since the active methods are opaque behind the BMI API.

Option POLL_PLAN:  Modify the construct_poll_plan function in BMI, which
already tries to be fair, so that it's aware of the performance
discrepancy between methods.  Maybe it can just skip tcp every other
time, for example.  This is probably the easiest, since it doesn't
require API changes and the like.
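
A minimal sketch of the "skip tcp every other time" idea follows.  This
is not the existing construct_poll_plan code: the sketch_* names, the
per-method poll_interval field, and the pass counter are all assumptions
made up for the example, and "fast_net" is just a placeholder method name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-method record (not the real BMI method table). */
struct sketch_method {
    const char *name;
    int poll_interval;   /* poll on every Nth pass; 1 = every pass */
};

/* Sets plan[i] to 1 if method i should be polled on this pass, else 0.
 * Returns the number of methods selected. */
int sketch_build_poll_plan(const struct sketch_method *methods, int nmethods,
                           unsigned long pass, int *plan)
{
    int selected = 0;
    int i;

    assert(methods != NULL && plan != NULL);
    for (i = 0; i < nmethods; i++) {
        /* Treat a non-positive interval as "poll every pass". */
        int interval = methods[i].poll_interval > 0
                           ? methods[i].poll_interval : 1;

        plan[i] = (pass % (unsigned long)interval) == 0;
        selected += plan[i];
    }
    return selected;
}
```

With intervals of 1 for the fast method and 2 for tcp, tcp is polled on
passes 0, 2, 4, and so on.  A method with interval N still gets serviced
every Nth pass, so it is slowed down but never starved.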

-sam

