[Pvfs2-developers] terminating state machines
Walter B. Ligon III
walt at clemson.edu
Thu Jul 27 10:56:40 EDT 2006
Sam Lang wrote:
>
> On Jul 26, 2006, at 6:16 PM, Phil Carns wrote:
>
>>> I think I'm getting voted down here, so I should probably just
>>> shut up, but I don't think in practice we're going to have that many
>>> child state machines that iterating through the list is at all
>>> costly. I'm arguing for simpler mechanisms that fit in with the
>>> job subsystem over something more fancy and possibly slightly
>>> better performing.
>>
>>
>> Well, as far as the number of SMs goes, I would rather not risk it.
>> I still hope this is lightweight enough that we could eventually use
>> it in more places that would generate a lot of children (like a
>> re-architected sys-io implementation), though I don't know if that
>> will pan out in practice. I got bitten by a similar assumption in
>> the flow protocol- it used to track all of its posted operations for
>> testing rather than relying on someone to notify it of completion.
>> Admittedly the flow protocol is a more obvious case and I should have
>> known better, but at the time it seemed reasonable :)
>>
>
> Hmm...I had been thinking about a flow implementation that used the new
> concurrent state machine code...it sounds like that's a bad idea
> because the testing and restarting would take too long to switch
> between bmi and trove? We use the post/test model through pvfs2
> though, so maybe I don't understand the issue.
>
>>>> I think that the way that you describe would work fine too, but it
>>>> would require a little more active work to check the status of the
>>>> array of child SMs and would require more code to keep track of them.
>>
>>
>>> Probably a bit more code yes, but it seems cleaner than keeping
>>> around backpointers and checking for parents. Instead of driving
>>> all state machines from one place, this event notification scheme
>>> essentially replaces the last child state machine with the parent,
>>> which seems like a bit of a hack and harder to debug.
>>
>>
>> I think I'm lost now. What do you mean by replace? The states are
>> still isolated, jobs trigger the transitions, only one state action
>> gets executed at a time, there still may be a time gap between
>> completion of any given child and when the parent picks up processing
>> again, and there are still frames. I think both approaches will look
>> the same when running unless I missed something. If Walt puts a
>> longjmp() in there we can both hit him over the head.
>>
> Heh. Don't give him ideas! ;-)
>
> I was operating under the constraint that a state machine can only post
> a job for itself. If I understand the current plan correctly, using
> job_null in the child state machine to post a job for the parent breaks
> that constraint, and so in some sense is a replace (the job_null
> actually takes the parent smcb pointer). I think you're probably right
> that it's not a big difference either way; it's just cleaner in my head
> to only have state machines posting jobs for themselves.
>
>> I think having a pointer to the parent actually improves debuggability
>> (though I'm not sure this approach actually requires it, all you
>> really need is either a job descriptor or a pointer to a counter).
>> If I have a state machine that does something bad or gets stuck it
>> would be nice to be able to work backwards to find out who invoked
>> it, without having to search for it in a separate data structure.
>>
>> I don't mean to keep struggling with this issue- I honestly think
>> that both approaches are pretty good, and if Walt implements it the
>> way I think he is going to, then 95% of developers won't notice the
>> difference anyway. At this point I am mostly hammering away to make
>> sure I am not missing a larger issue...
>
>
> Walt probably got more discussion than he bargained for, but at the
> least, lively discussion keeps me awake in the afternoon ;-).
>
> -sam
>
>>
>> -Phil
>>
Good discussion. Phil has convinced me the level of dependency is low,
and unless I completely misunderstand Sam, the complexity of the parent
pointer/job_null approach is a lot less than the alternative, and I like
low complexity. I also think debugging will be simpler. So that's
where I'm going.
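For anyone following along, here is a minimal sketch of the parent
pointer/counter idea discussed above. It is only an illustration under
stated assumptions: struct sm_ctl, start_child(), terminate_child() and
post_null_job() are hypothetical names, not the real smcb or job
interface, and "posting a null job" is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified control block; NOT the real PVFS2 smcb. */
struct sm_ctl {
    const char *name;
    struct sm_ctl *parent;   /* back pointer; NULL for a top-level machine */
    int children_running;    /* children that have not yet terminated */
    int ready;               /* nonzero once a completion is queued for us */
};

/* Stand-in for posting a null job on behalf of a machine. In this sketch
 * it only marks the target as ready to be picked up again. */
static void post_null_job(struct sm_ctl *target)
{
    target->ready = 1;
}

/* Register a child with its parent and count it as running. */
static void start_child(struct sm_ctl *parent, struct sm_ctl *child,
                        const char *name)
{
    child->name = name;
    child->parent = parent;
    child->children_running = 0;
    child->ready = 0;
    parent->children_running++;
}

/* Called when a child reaches its terminating state. The child follows
 * its back pointer, decrements the parent's counter, and wakes the parent
 * only when the last child is done. Returns 1 if the parent was woken. */
static int terminate_child(struct sm_ctl *child)
{
    struct sm_ctl *parent = child->parent;

    if (parent == NULL)
        return 0;            /* top-level machine: nobody to notify */

    assert(parent->children_running > 0);
    if (--parent->children_running == 0) {
        post_null_job(parent);
        return 1;
    }
    return 0;
}
```

With three children started under one parent, the first two calls to
terminate_child() just decrement the counter, and the third marks the
parent ready. The parent never scans a list of children, and a stuck
child can be traced back to whoever invoked it through child->parent.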
I'll have to think of other topics to get you guys going from time to
time! ;-)
Now off to figure out a way to use setjmp/longjmp in my implementation!
Walt
--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University