[Pvfs2-developers] bmi multiple address endpoints

Sam Lang slang at mcs.anl.gov
Tue Nov 6 15:27:41 EST 2007


On Nov 6, 2007, at 1:41 PM, Pete Wyckoff wrote:

> slang at mcs.anl.gov wrote on Tue, 06 Nov 2007 10:50 -0600:
>> The attached patch implements BMI multiple address endpoints that we
>> talked about some time ago.  To refresh everyone's memory, this
>> allows a set of addresses to be specified:
>>
>> tcp://host1:3334/pvfs2-fs,tcp://host2:3335/pvfs2-fs,tcp::// 
>> host3:3336/
>> pvfs2-fs
>>
>> In the config file for a given storage endpoint.  The BMI code
>> manages the endpoint, setting the currently used address to the first
>> one in the list.  On message failure, the endpoint is transitioned to
>> point to the next address in the list.  This continues in a round-
>> robin fashion.
>
> This is good stuff.  I'd like to help review it.  Can you do some
> trivial things first to make that easier?
>
> 1.
>> +    struct bmi_endpoint_ref_s *newref;
> [..]
>> +    /* haven't seen any of these addresses before, add a new  
>> endpoint */
>> +    newref = malloc(sizeof(struct bmi_addr_ref_s));
>
> Please change all malloc(sizeof()) and memset(,,sizeof()) to use the
> variable name, not its type.  The bug you did above (and other
> places) is way too common.  We just have to stop doing that.  Like
> this instead:
>
> 	newref = malloc(sizeof(*newref));
>
> 2.
> There's a bunch of stuff that seems out of place.  Can you check
> that in or push it to the side so we can concentrate on the core?
> 179 kB is a big patch.  :)

The bmi-addr.[ch] files are new files, and the critical part of the  
patch.  They obsolete the ref_st calls, so the reference-list.[ch]  
files have been removed in the patch.  That may be adding to the  
number of lines.

>
> Renaming all the method_ops is a big uninteresting part.

Yeah, I got tired of calling them over and over with the BMI_method_  
tagged on the front.  I can try to pull that stuff out.

>
> Adding bmi_ in front of everything too, but it's too hard to rip
> that out mid-patch now.  Random whitespace fixes too.  Good changes,
> just hard to read.
>
> 3.
> Some trivial bugs.
>
>> +    ref->current = ref->current + 1 % ref->count;
>
> Check your precedence table.
>
>> + * (C) 2001 Clemson University and The University of Chicago
>
> New files go back in time.
>
> Can you put some comments above each of the three new structs in
> bmi-addr.c?  I keep getting confused on which is the old-style
> addr and which is the comma-separated list.  And what the various
> "link" and "refs" fields point to.

Sure thing.

>
>> I've done some basic testing, but there's still more to do.  The
>> client IO state machine is a bear, and testing all the cases where
>> things could failover (requests, flows, acks, etc.) is going to take
>> some more work.  I wanted to get the patch out there to allow others
>> to provide feedback.
>
> Yeah, totally.  But it can be made to work.
>
> What did we decide do with mixed method usage?  The old semantic
> was "ib://foo:2345/pvfs2-fs,tcp://foo:2347/pvfs2-fs" means try to
> use IB, but if you don't have an IB nic, switch to TCP.  I agree we
> decided that was less interesting.  Do we just add docs that say
> that this comma is now for multi-pathing?  If people try this
> example with the new code, it will flip from IB to TCP at every
> timeout.  The old behavior was to stick with the first one where
> you had the hardware.  In other words, probably some docs somewhere
> should be added to this patch.

Right, I thought we had decided to go the multi-path route.  I guess  
there could be a config option that would set the flipping from round- 
robin to try-once, giving the old behavior.

>
> On the server side, the ,-separated addresses mean "listen on all
> these interfaces".  What do servers do now when they see your
> tcp://host1,tcp://host2 example string above?  Looks like they would
> fail to listen on anything (host1 can't bind to host2 address?).
> This has to be in the fs.conf as the Alias string for each server so
> that clients can find the IO servers, not just in the pvfs2tab to
> find the config server.

Yeah that formatting isn't going to work.  I wanted to keep it  
simple, but I guess that's not possible.  Should we separate fallback  
addresses with ';' instead?

tcp://host1:3331,ib://host1:3335;tcp://host2:3332,ib://host2:3336

Something like that?
-sam

>
> 		-- Pete
>



More information about the Pvfs2-developers mailing list