[Pvfs2-developers] terminating state machines

Sam Lang slang at mcs.anl.gov
Thu Jul 27 11:37:44 EDT 2006


On Jul 27, 2006, at 10:16 AM, Phil Carns wrote:

>
>> Hmm...I had been thinking about a flow implementation that used
>> the new concurrent state machine code...it sounds like that's a
>> bad idea because the testing and restarting would take too long
>> to switch between bmi and trove?  We use the post/test model
>> through pvfs2 though, so maybe I don't understand the issue.
>
> I don't think that is a bad idea.  There were really two separate but
> related problems in one of the older flow protocol implementations.
> I can try to describe them a little more here if I can remember:
>
> - explicitly tracking and testing each trove and bmi operation: It
> basically kept arrays that listed pending trove and bmi ops, and
> would call testsome() to service them.  This was a problem because
> of the time it took to keep running up and down those arrays (when
> building them at the flow level, or when testing them at the
> trove/bmi level).  The solution is to just use testcontext() and let
> trove/bmi tell you when something finishes without managing extra
> state.
>
> - thread switch time: the architecture here was set up at one time  
> to have one thread pushing the test functions for bmi, another  
> thread pushing the test functions for trove, while another thread  
> was processing the flow and posting new operations.  The problem  
> here is that it (at the time) took too long to jump between the  
> "pushing" threads and the "processing" thread when an operation  
> finished that should trigger progress on the flow. This led to the  
> thread-mgr.c code and associated callbacks.  The callbacks actually  
> drive the flow progress and post new operations.  That means that  
> the same thread that pushes testcontext() gets to trigger the next  
> post, without waiting on the latency of waking up a different  
> thread to do something (using a condition variable, etc.).  I managed
> to reuse the thread-mgr for the job code as well, so that one  
> testcontext() call triggers callbacks to both the job and flow  
> interfaces.
>
> I don't think either of the above issues precludes different flow  
> protocol implementations, and they are really kind of orthogonal to  
> whether state machines are used or not.  The first issue is solved  
> just by using testcontext() rather than manually tracking operations.
>
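
To make the first issue concrete, here is a minimal sketch of the two polling styles. It is not PVFS2 code: struct op, struct context, and all of the function names are hypothetical stand-ins, and the real testsome()/testcontext() calls in BMI and Trove take different arguments. The only point is that the scanning style costs time proportional to the number of pending operations on every poll, while the context style only touches what has actually finished.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPS 64

/* Hypothetical operation record; not a PVFS2 type. */
struct op {
    int id;
    int done;              /* set by the lower layer on completion */
};

/* testsome()-style: the caller keeps an array of pending operations
 * and walks all of it on every poll, finished or not. */
static int poll_by_scanning(struct op **pending, size_t n, int *out_ids)
{
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        if (pending[i] != NULL && pending[i]->done) {
            out_ids[count++] = pending[i]->id;
            pending[i] = NULL;          /* slot is free again */
        }
    }
    return count;
}

/* testcontext()-style: the lower layer appends each finished operation
 * to a per-context completion queue, so the caller keeps no array. */
struct context {
    int completed[MAX_OPS];
    size_t head, tail;     /* ring buffer positions */
};

/* Called by the (hypothetical) lower layer when an operation finishes. */
static void context_complete(struct context *ctx, int id)
{
    assert(ctx->tail - ctx->head < MAX_OPS);
    ctx->completed[ctx->tail % MAX_OPS] = id;
    ctx->tail++;
}

/* Called by the poller: cost depends only on what actually completed. */
static int context_drain(struct context *ctx, int *out_ids, int max)
{
    int count = 0;
    while (ctx->head != ctx->tail && count < max) {
        out_ids[count++] = ctx->completed[ctx->head % MAX_OPS];
        ctx->head++;
    }
    return count;
}
```

With a thousand pending operations and one completion, poll_by_scanning() still visits a thousand slots, while context_drain() returns after one iteration.
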
> The second issue could be solved in a variety of ways, some of
> which may be better than what we have now.  The callback approach
> is efficient enough, but is hard to debug.  Of course it is also
> possible that the thread switch (i.e. condition signal) latency is
> low enough nowadays that you don't even need to worry about it
> anymore.  I last looked at this problem before NPTL arrived on the
> scene.
>
> At any rate I think a state machine based flow protocol could dodge  
> issue #2 by either:
> - lucking out with a faster modern thread implementation
> - being smarter about how thread work is divided up
> - using callbacks as we do now, and making the state machine  
> mechanism thread safe so that it can be driven directly from those  
> callbacks rather than from a testcontext() work loop
>
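
The callback option can be sketched roughly as follows. This is a single-threaded toy, not the thread-mgr.c code: completion_queue, poll_once(), flow_callback(), and job_callback() are all hypothetical names, and "posting" an operation here just queues its completion immediately. It only illustrates the shape of the idea: the same loop that tests for completions also runs the callback that posts the next operation, so no second thread has to be woken up, and one poll can serve both a flow and a job.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion record: each posted operation carries the
 * callback that should run when it finishes. */
typedef void (*completion_fn)(void *user_ptr);

struct completion {
    completion_fn fn;
    void *user_ptr;
};

#define QUEUE_MAX 32

struct completion_queue {
    struct completion items[QUEUE_MAX];
    size_t count;
};

/* Stand-in for the lower layer reporting a finished operation. */
static void queue_push(struct completion_queue *q, completion_fn fn,
                       void *user_ptr)
{
    assert(q->count < QUEUE_MAX);
    q->items[q->count].fn = fn;
    q->items[q->count].user_ptr = user_ptr;
    q->count++;
}

/* Stand-in for the thread that pushes the test function: it runs each
 * callback directly instead of signalling another thread. */
static int poll_once(struct completion_queue *q)
{
    struct completion batch[QUEUE_MAX];
    size_t n = q->count;

    /* Snapshot the batch so callbacks can safely post new work. */
    for (size_t i = 0; i < n; i++)
        batch[i] = q->items[i];
    q->count = 0;

    for (size_t i = 0; i < n; i++)
        batch[i].fn(batch[i].user_ptr);
    return (int)n;
}

/* Toy flow: every completion immediately posts the next buffer. */
struct flow {
    struct completion_queue *q;
    int buffers_left;
    int buffers_done;
};

static void flow_callback(void *user_ptr)
{
    struct flow *f = user_ptr;
    f->buffers_done++;
    if (f->buffers_left > 0) {
        f->buffers_left--;
        queue_push(f->q, flow_callback, f);   /* post the next operation */
    }
}

/* Toy job: the callback just records that the job finished. */
struct job {
    int completed;
};

static void job_callback(void *user_ptr)
{
    struct job *j = user_ptr;
    j->completed = 1;
}
```

Driving a state machine from flow_callback() instead of from a work loop is where the thread-safety requirement in the last bullet would come in, since in a real system the callback runs on whichever thread happens to be polling.
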
> On a related note, it is important to remember that trove has its
> own internal thread too, so on the trove push side (depending on
> your design) you could have to worry about a chain of 2 threads  
> that have to be woken up to get something done at completion time.   
> The trove part of that chain can't be avoided without changing the  
> API.
>
> Sorry about the tangent here, but I figured I may as well share  
> some warnings about things to look out for here.  I think it would  
> be good to have a cleaner flow protocol implementation.
>

Thanks for the detailed explanation Phil.  I hadn't thought about the  
context switches that might slow down flow.  I was primarily thinking  
of something that would be cleaner, and easier to modify and test for  
different scenarios.  If at some point I get around to playing with a  
flow impl that uses the concurrent state machine framework, I'll open  
up the discussion again to avoid any of the pitfalls you described.

-sam

>>> I think I'm lost now.  What do you mean by replace?  The states
>>> are still isolated, jobs trigger the transitions, only one state
>>> action gets executed at a time, there still may be a time gap
>>> between completion of any given child and when the parent picks
>>> up processing again, and there are still frames.  I think both
>>> approaches will look the same when running unless I missed
>>> something.  If Walt puts a longjmp() in there we can both hit
>>> him over the head.
>>>
>> Heh.  Don't give him ideas! ;-)
>> I was operating under the constraint that a state machine can
>> only post a job for itself.  If I understand the current plan
>> correctly, using job_null in the child state machine to post a
>> job for the parent breaks that constraint, and so in some sense
>> is a replace (the job_null actually takes the parent smcb
>> pointer).  I think you're probably right that it's not a big
>> difference either way, it's just cleaner in my head to only have
>> state machines posting jobs for themselves.
>
> I see what you are saying.  I guess it depends on how you look at  
> it.  I had kind of started thinking of the jobs as a signalling
> mechanism since they are the construct that "signals" a state
> machine to make its next transition.  The job_null() approach just
> makes it so that a child state machine is what triggers this  
> particular signal, rather than a bmi/trove/dev/req_sched/flow  
> component.  I know this is a change in the model and adds a  
> dependency that wasn't previously there, but at least job_null() is  
> just a few dozen lines of code.  If someone reuses the SM code  
> elsewhere, I would guess that is one of the more minor worries  
> considering that they would need a whole new mechanism (other than  
> the job api) to motivate all of the transitions anyway.
>
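
As a rough illustration of jobs as a signalling mechanism, the sketch below has a child state machine post a null job whose target is its parent. Everything here is hypothetical: struct smcb is a toy stand-in for the real control block, post_null_job() is not the actual job_null() signature, and the rule that only the last child to finish posts the signal is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy state machine control block; the real smcb has far more state. */
struct smcb {
    struct smcb *parent;
    int children_running;   /* children the parent is still waiting on */
    int resumed;            /* transitions taken after being signalled */
};

#define MAX_JOBS 16

/* Completed-job queue: each entry names the state machine to advance. */
struct job_queue {
    struct smcb *targets[MAX_JOBS];
    size_t count;
};

/* Hypothetical null job: does no I/O and completes immediately; its
 * only effect is to make `target` eligible for its next transition. */
static void post_null_job(struct job_queue *q, struct smcb *target)
{
    assert(q->count < MAX_JOBS);
    q->targets[q->count++] = target;
}

/* Runs when a child reaches its terminating state.  The child posts a
 * job on behalf of its parent, which is the change in the model under
 * discussion.  Assumption: only the last child posts the signal. */
static void child_terminate(struct job_queue *q, struct smcb *child)
{
    struct smcb *parent = child->parent;
    assert(parent != NULL && parent->children_running > 0);
    parent->children_running--;
    if (parent->children_running == 0)
        post_null_job(q, parent);
}

/* Driver loop body: every completed job triggers one transition on
 * its target, no matter which component posted the job. */
static int drive(struct job_queue *q)
{
    int transitions = 0;
    for (size_t i = 0; i < q->count; i++) {
        q->targets[i]->resumed++;
        transitions++;
    }
    q->count = 0;
    return transitions;
}
```

From the driver's point of view the parent's transition looks the same as one triggered by a bmi or trove completion; the only difference is who posted the job.
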
>> Walt probably got more discussion than he bargained for, but at
>> the least, lively discussion keeps me awake in the afternoon ;-).
>
> Heh- same here :)
>
> -Phil
>


