[Pvfs2-developers] bmi multiple address endpoints
Sam Lang
slang at mcs.anl.gov
Tue Nov 6 15:27:41 EST 2007
On Nov 6, 2007, at 1:41 PM, Pete Wyckoff wrote:
> slang at mcs.anl.gov wrote on Tue, 06 Nov 2007 10:50 -0600:
>> The attached patch implements BMI multiple address endpoints that we
>> talked about some time ago. To refresh everyone's memory, this
>> allows a set of addresses to be specified:
>>
>> tcp://host1:3334/pvfs2-fs,tcp://host2:3335/pvfs2-fs,tcp:://
>> host3:3336/
>> pvfs2-fs
>>
>> In the config file for a given storage endpoint. The BMI code
>> manages the endpoint, setting the currently used address to the first
>> one in the list. On message failure, the endpoint is transitioned to
>> point to the next address in the list. This continues in a round-
>> robin fashion.
>
> This is good stuff. I'd like to help review it. Can you do some
> trivial things first to make that easier?
>
> 1.
>> + struct bmi_endpoint_ref_s *newref;
> [..]
>> + /* haven't seen any of these addresses before, add a new
>> endpoint */
>> + newref = malloc(sizeof(struct bmi_addr_ref_s));
>
> Please change all malloc(sizeof()) and memset(,,sizeof()) to use the
> variable name, not its type. The bug you did above (and other
> places) is way too common. We just have to stop doing that. Like
> this instead:
>
> newref = malloc(sizeof(*newref));
>
> 2.
> There's a bunch of stuff that seems out of place. Can you check
> that in or push it to the side so we can concentrate on the core?
> 179 kB is a big patch. :)
The bmi-addr.[ch] files are new files, and the critical part of the
patch. They obsolete the ref_st calls, so the reference-list.[ch]
files have been removed in the patch. That may be adding to the
number of lines.
>
> Renaming all the method_ops is a big uninteresting part.
Yeah, I got tired of calling them over and over with the BMI_method_
tagged on the front. I can try to pull that stuff out.
>
> Adding bmi_ in front of everything too, but it's too hard to rip
> that out mid-patch now. Random whitespace fixes too. Good changes,
> just hard to read.
>
> 3.
> Some trivial bugs.
>
>> + ref->current = ref->current + 1 % ref->count;
>
> Check your precedence table.
>
>> + * (C) 2001 Clemson University and The University of Chicago
>
> New files go back in time.
>
> Can you put some comments above each of the three new structs in
> bmi-addr.c? I keep getting confused on which is the old-style
> addr and which is the comma-separated list. And what the various
> "link" and "refs" fields point to.
Sure thing.
>
>> I've done some basic testing, but there's still more to do. The
>> client IO state machine is a bear, and testing all the cases where
>> things could failover (requests, flows, acks, etc.) is going to take
>> some more work. I wanted to get the patch out there to allow others
>> to provide feedback.
>
> Yeah, totally. But it can be made to work.
>
> What did we decide do with mixed method usage? The old semantic
> was "ib://foo:2345/pvfs2-fs,tcp://foo:2347/pvfs2-fs" means try to
> use IB, but if you don't have an IB nic, switch to TCP. I agree we
> decided that was less interesting. Do we just add docs that say
> that this comma is now for multi-pathing? If people try this
> example with the new code, it will flip from IB to TCP at every
> timeout. The old behavior was to stick with the first one where
> you had the hardware. In other words, probably some docs somewhere
> should be added to this patch.
Right, I thought we had decided to go the multi-path route. I guess
there could be a config option that would set the flipping from round-
robin to try-once, giving the old behavior.
>
> On the server side, the ,-separated addresses mean "listen on all
> these interfaces". What do servers do now when they see your
> tcp://host1,tcp://host2 example string above? Looks like they would
> fail to listen on anything (host1 can't bind to host2 address?).
> This has to be in the fs.conf as the Alias string for each server so
> that clients can find the IO servers, not just in the pvfs2tab to
> find the config server.
Yeah that formatting isn't going to work. I wanted to keep it
simple, but I guess that's not possible. Should we separate fallback
addresses with ';' instead?
tcp://host1:3331,ib://host1:3335;tcp://host2:3332,ib://host2:3336
Something like that?
-sam
>
> -- Pete
>
More information about the Pvfs2-developers
mailing list