[Pvfs2-developers] terminating state machines

Sam Lang slang at mcs.anl.gov
Wed Jul 26 15:32:22 EDT 2006


On Jul 26, 2006, at 12:37 PM, Walter B. Ligon III wrote:

>
> OK, guys, I have another issue I want input on.  When child SMs  
> terminate they have to notify their parent.  The parent has to wait  
> for all the children to terminate.  So I've been thinking to use  
> the job subsystem for this: the parent would post a job to wait for  
> N children,
> and each child would post a job, the last one releasing the parent.
>
> Now I see two ways to implement this - one is to implement this  
> directly in the state machine code.  The parent simply stops  
> running (because it does not schedule a job yet returns DEFERRED).   
> Each child decrements a counter, and when it hits 0 the parent is  
> restarted.  This is a little ugly because the waiting parent is not  
> being held on any list or queue (up to now all waiting SMs are in  
> the job subsystem), also the last terminating child becomes the  
> parent as it starts executing the parent code.  Things can get  
> weird when one SM starts children that start children, and so on.
>
> Now the other way to implement this is with the job subsystem as I  
> suggested above.  Much cleaner except for one thing:  up to now the  
> state machine subsystem has had no dependency at all on the job  
> subsystem.  If we do it this way, this function only works with the  
> job system intact.  I'd prefer not to do this, but it does seem the  
> cleanest, most logical means.
>

I don't see why the two have to be dependent for this to work.  Do
you mean that when the parent posts a job, the state machine stepping
code would handle the actual posting?  I was assuming that the parent
state action could just call job_concurrent_sm_post (or whatever it's
called).

Could it be similar to the request scheduler job posting code?  The
parent state action could call job_concurrent_sm_post with an array
of the child SMs.  That function would just call sm_post and add the
parent SM and its array to an operation queue.  A
job_concurrent_sm_test function could then test for completion of a
parent SM by checking whether every SM in the array has completed.
The job_testcontext code would have to be modified, of course (maybe
rework the do_one_test_cycle_req_sched function to also test parent
SM jobs), but all of that still seems to be independent of the state
machine code: someone could use the state machine code separately
and drive state machines using something other than the job
framework.  I don't know if all that makes sense in the context of
the changes you've made, but that's what I had in mind when I
suggested posting a job for the parent.
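
To make the idea concrete, here is a rough sketch of the queue-and-test
scheme described above.  It is only an illustration, not code from the
tree: struct sm, struct concurrent_sm_job, concurrent_sm_post,
concurrent_sm_test and concurrent_sm_test_cycle are all made-up names,
the "complete" flag stands in for however an SM really records
termination, and the post function here only queues the parent (the
real thing would also start the children, e.g. via sm_post).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a state machine control block. */
struct sm {
    int complete;               /* nonzero once this SM has terminated */
};

/* Hypothetical queue entry: one waiting parent plus its children. */
struct concurrent_sm_job {
    struct sm *parent;
    struct sm **children;
    size_t nchildren;
    struct concurrent_sm_job *next;
};

/* Pending parent jobs, kept on a simple singly linked list. */
static struct concurrent_sm_job *pending = NULL;

/* Queue the parent with its array of children.  Assumption: starting
 * the children is done elsewhere; this sketch only records the wait. */
void concurrent_sm_post(struct concurrent_sm_job *job, struct sm *parent,
                        struct sm **children, size_t nchildren)
{
    assert(job != NULL && parent != NULL);
    job->parent = parent;
    job->children = children;
    job->nchildren = nchildren;
    job->next = pending;
    pending = job;
}

/* Return 1 if every child of this job has completed, 0 otherwise. */
int concurrent_sm_test(const struct concurrent_sm_job *job)
{
    size_t i;
    for (i = 0; i < job->nchildren; i++) {
        if (!job->children[i]->complete)
            return 0;
    }
    return 1;
}

/* One test cycle: unlink every job whose children are all done, call
 * release(parent) for each if a callback was given, and return the
 * number of parents released. */
size_t concurrent_sm_test_cycle(void (*release)(struct sm *parent))
{
    size_t released = 0;
    struct concurrent_sm_job **link = &pending;

    while (*link != NULL) {
        struct concurrent_sm_job *job = *link;
        if (concurrent_sm_test(job)) {
            *link = job->next;  /* remove the job from the queue */
            job->next = NULL;
            if (release != NULL)
                release(job->parent);
            released++;
        } else {
            link = &job->next;
        }
    }
    return released;
}
```

Nothing in the sketch touches the state machine stepping code: the
waiting parent lives on a queue owned by the job layer, and the test
cycle is the only place that decides when it can be restarted.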

-sam

> Comments?
>
> Walt
> -- 
> Dr. Walter B. Ligon III
> Associate Professor
> ECE Department
> Clemson University
> _______________________________________________
> Pvfs2-developers mailing list
> Pvfs2-developers at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>


