[Pvfs2-developers] terminating state machines

Sam Lang slang at mcs.anl.gov
Wed Jul 26 18:09:45 EDT 2006


On Jul 26, 2006, at 5:06 PM, Phil Carns wrote:

>
>> I don't see why the two have to be dependent for this to work.   
>> Do  you mean by the parent posting a job, the state machine  
>> stepping code  would handle the actual posting?  I was assuming  
>> that the parent  state action could just call  
>> job_concurrent_sm_post (or whatever it's  called).
>> Could it be similar to the request scheduler job posting code?   
>> The  parent state action could call job_concurrent_sm_post with an  
>> array  of the child sms, which just calls sm_post and adds the  
>> parent sm and  its array to an operation queue.  Then a  
>> job_concurrent_sm_test  function could test for completion of a  
>> parent sm by looking at all  the sms in the array to see if they  
>> completed.  The job_testcontext  code would have to be modified of  
>> course (maybe rework the  do_one_test_cycle_req_sched function to  
>> also test parent sm jobs),  but all of that still seems to be  
>> independent of the state machine  code (i.e. someone could use the  
>> state machine code separately and  drive state machines using  
>> something other than the job framework).   I don't know if all  
>> that makes sense in the context of the changes  you've made, but  
>> that's what I had in mind when I suggested posting a  job for the  
>> parent.
>
> I think I follow what you are describing, but I am not entirely  
> sure. If so, I think there is one advantage to the approach that  
> Walt has been hashing out thus far.  I think that what Walt is  
> describing is event-driven, in a sense.  No one has to actively  
> look to see if all of the children have finished.  Instead, the  
> children each send notification (by calling a release function or  
> manually decrementing a counter) in their completion function, with  
> the parent eventually getting a single notification (representing  
> all of the children) through the existing job completion queue  
> mechanism.
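
To make sure we're talking about the same thing, here is a rough sketch
of the event-driven scheme as I understand it.  The struct and function
names are made up for illustration and are not PVFS2 code, and
queue_parent_completion() is only a stand-in for pushing the parent onto
the job completion queue:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parent record; not the PVFS2 state machine struct. */
struct parent_sm {
    int pending;   /* children that have not completed yet */
    int notified;  /* times the parent was queued for completion */
};

/* Hypothetical child record carrying a back pointer to its parent. */
struct child_sm {
    struct parent_sm *parent;  /* NULL for a top-level state machine */
};

/* Stand-in for adding the parent to the job completion queue. */
static void queue_parent_completion(struct parent_sm *p)
{
    p->notified++;
}

/* Called from each child's completion function. */
static void child_sm_release(struct child_sm *c)
{
    struct parent_sm *p = c->parent;

    if (p == NULL)
        return;                       /* no parent to notify */

    assert(p->pending > 0);           /* each child releases only once */
    if (--p->pending == 0)
        queue_parent_completion(p);   /* last child wakes the parent */
}
```

The parent gets exactly one notification, from whichever child happens
to finish last, and nobody ever walks a list of children.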

I think I'm getting voted down here, so I should probably just  
shut up, but I don't think that in practice we're going to have so many  
child state machines that iterating through the list is at all  
costly.  I'm arguing for simpler mechanisms that fit in with the job  
subsystem over something fancier and possibly slightly better  
performing.
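
For comparison, the test side of the polling approach could be as small
as the sketch below.  Again, the types and the concurrent_sm_test() name
are hypothetical placeholders, not existing job code; the real thing
would be called from the job_testcontext path:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical child record; not the PVFS2 state machine struct. */
struct polled_child_sm {
    bool completed;  /* set when the child state machine finishes */
};

/* Hypothetical operation queue entry: a parent's array of children. */
struct concurrent_sm_op {
    struct polled_child_sm **children;
    size_t count;
};

/* Returns true once every child in the array has completed. */
static bool concurrent_sm_test(const struct concurrent_sm_op *op)
{
    assert(op != NULL);

    for (size_t i = 0; i < op->count; i++) {
        if (!op->children[i]->completed)
            return false;  /* at least one child is still running */
    }
    return true;
}
```

Each test is linear in the number of children, which is the cost I'm
arguing is negligible for the array sizes we're likely to see.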

>
> I think that the way that you describe would work fine too, but it  
> would require a little more active work to check the status of the  
> array of child SMs and would require more code to keep track of them.

Probably a bit more code, yes, but it seems cleaner than keeping  
around backpointers and checking for parents.  Instead of driving all  
state machines from one place, this event notification scheme  
essentially replaces the last child state machine with the parent,  
which seems like a bit of a hack and harder to debug.

-sam

>
> I think you are right though, that you could pull off your version  
> without the children actually having to make a job_* call.
>
> -Phil
>


