[Pvfs2-developers] Initializing a BMI method

Scott Atchley atchley at myri.com
Thu Sep 21 14:32:09 EDT 2006


On Sep 21, 2006, at 2:11 PM, Pete Wyckoff wrote:

> atchley at myri.com wrote on Thu, 21 Sep 2006 13:40 -0400:
>> And the non-NULL will be an mx://... string, no?
>
> No, this is the bmi_method_addr that comes back from your lookup
> method.  You cast it up into your own private address structure,
> and do not have to parse the string again.

Ok.
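
The point about casting the lookup result could be sketched roughly as follows. This is a minimal illustration, not the PVFS2 code: `method_addr_sketch` is a simplified stand-in for BMI's address handle, and `mx_addr_sketch`, `mx_lookup_sketch`, `mx_private_sketch`, and the `mx://host:board:endpoint` layout are all hypothetical names and assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for BMI's generic address handle (assumption:
 * the real structure in the PVFS2 sources carries more fields). */
struct method_addr_sketch {
    int   method_type;   /* which method owns this address */
    void *method_data;   /* method-private payload */
};

/* Hypothetical private address structure for an MX method. */
struct mx_addr_sketch {
    char hostname[64];
    int  board;
    int  endpoint_id;
};

/* Lookup: parse the URI once and cache the result in the handle.
 * The "mx://host:board:endpoint" layout is assumed for illustration. */
struct method_addr_sketch *mx_lookup_sketch(const char *uri)
{
    struct method_addr_sketch *handle = calloc(1, sizeof(*handle));
    struct mx_addr_sketch *priv = calloc(1, sizeof(*priv));

    if (handle == NULL || priv == NULL ||
        sscanf(uri, "mx://%63[^:]:%d:%d",
               priv->hostname, &priv->board, &priv->endpoint_id) != 3) {
        free(priv);
        free(handle);
        return NULL;
    }
    handle->method_data = priv;
    return handle;
}

/* Later calls recover the private structure from the handle
 * instead of parsing the string again. */
const struct mx_addr_sketch *mx_private_sketch(const struct method_addr_sketch *addr)
{
    assert(addr != NULL && addr->method_data != NULL);
    return addr->method_data;
}

/* Release a handle returned by mx_lookup_sketch(). */
void mx_addr_free_sketch(struct method_addr_sketch *addr)
{
    if (addr != NULL) {
        free(addr->method_data);
        free(addr);
    }
}
```

With this shape, an initialize routine that receives a non-NULL handle would call `mx_private_sketch()` to read the board and endpoint directly.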

>> As for listening on all boards, that is not supported by MX directly
>> (an MX endpoint is specific to a NIC). If this is a requirement of
>> PVFS (to be able to use multiple Myricom NICs), then we have a couple of
>> options. First, can PVFS open two BMI methods of the same type? If
>> so, then we just have to ensure that all bmi_mx state is local to
>> each process (no globals, etc.). If BMI cannot open two of the same
>> type, then I will have to manage two (or more) MX endpoints in  
>> bmi_mx.
>
> I'm not sure if that's possible.  BMI does fully support
> "multi-homed" networks on _different_ types (like IB or TCP), but
> I don't think it can do what you're thinking, nor would you want it
> to do that.

Agreed. :-)

> Consider BMI_test_unexpected.  Would you want BMI to call that
> separately for each board, or can you instead check each board
> yourself?  BMI currently expects the latter.  Also there are some
> features that would likely be global across all BMI_mx instances,
> like a lookup table that would convert a peer BMI address into a
> local board number.  That would be a bit unclean.

Definitely.
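
As a rough sketch of "check each board yourself": a single method instance could loop over its boards inside its own test-unexpected routine, so BMI only ever calls it once. Everything here is hypothetical (`board_sketch`, `unexpected_sketch`, `testunexpected_all_boards`), and the `pending` counter merely simulates queued messages; no real MX or BMI calls are used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-board state kept inside one method instance. */
struct board_sketch {
    int open;      /* nonzero if an endpoint is open on this board */
    int pending;   /* simulated count of queued unexpected messages */
};

/* Hypothetical record handed back for each unexpected message. */
struct unexpected_sketch {
    int board;     /* board the message arrived on */
};

/* Collect up to `max` unexpected messages across all open boards.
 * Returns the number of entries written to `out`. */
int testunexpected_all_boards(struct board_sketch *boards, int nboards,
                              struct unexpected_sketch *out, int max)
{
    int n = 0;

    assert(boards != NULL && out != NULL);
    for (int b = 0; b < nboards && n < max; b++) {
        if (!boards[b].open)
            continue;               /* skip boards with no endpoint */
        while (boards[b].pending > 0 && n < max) {
            boards[b].pending--;    /* real code would poll the endpoint here */
            out[n++].board = b;
        }
    }
    return n;
}
```

Recording the board with each message is one way to avoid the global peer-to-board lookup table mentioned above, since the reply path already knows where the request came from.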

>> That then raises the issue of whether the NICs are in the same fabric
>> or in disjoint fabrics. If the same, then I can send (and receive)
>> using either one and we will need to consider some form of striping
>> over the NICs to ensure that we maximize utilization of both. If they
>> are disjoint, I will need to maintain separate peer lists for each
>> and determine which NIC needs to send and receive for a given peer.
>
> Fun.
>
>> Using more than one board definitely complicates matters. ;-)
>> Initially, I will support one board to get things going.
>
> Good plan.  Then figure out later if somebody wants to do a setup
> like this.  But keep it in mind as you think about the design
> decisions.
>
>> I am not worried about peer (i.e. server) state at initialize time.
>> Sending to others is not an issue as long as the URI is passed as
>> part of the send context. Does the client also get a listen_addr
>> string or not? I would assume not. If so, I am then free to open any
>> endpoint I want and then I would pass that info to the server before
>> sending my first sendunexpected message (i.e. in my method's connect
>> request message).
>
> That's how you tell if you're a server:  listen_addr == NULL => I am
> a client.  Sounds like this is the easy side of the communication.
>
>> If the client does not get a listen_addr URI and if the machine has
>> multiple NICs, I will not know which one to open. I could have a
>> #define that compiles in the board number but it would require that
>> all machines use the same board. Any suggestions?
>
> Don't open anything until somebody calls BMI_mx_sendunexpected().
> You could probe to make sure a NIC exists, but don't commit to using
> it until you need to do so.  Or punt like on the server side.  IB
> assumes only a single NIC for now, too.  Most people aren't crazy
> enough to use more than one of these things in a machine anyway.
>
> 		-- Pete

Initially, I will do the easy thing and set a default of board 0
(changeable via a #define). Later, if needed, I could open each NIC
and try to connect to the first peer (server), then use only the NIC
that succeeds from then on. That would still require all traffic to
use that one NIC.
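
A minimal sketch of that plan, combined with the listen_addr == NULL test for clients and the advice to defer opening until the first send. All names (`MX_DEFAULT_BOARD_SKETCH`, `mx_state_sketch`, `mx_initialize_sketch`, `mx_ensure_endpoint_sketch`) are hypothetical, the endpoint open is simulated with a flag, and opening the server endpoint at initialize time is an assumption rather than something settled in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time default board; override with -DMX_DEFAULT_BOARD_SKETCH=n. */
#ifndef MX_DEFAULT_BOARD_SKETCH
#define MX_DEFAULT_BOARD_SKETCH 0
#endif

/* Hypothetical method-wide state. */
struct mx_state_sketch {
    int is_server;       /* set when a listen address was supplied */
    int board;           /* board this process will use */
    int endpoint_open;   /* simulated "endpoint is open" flag */
};

/* Initialize: a NULL listen_addr means this process is a client. */
void mx_initialize_sketch(struct mx_state_sketch *s, const void *listen_addr)
{
    assert(s != NULL);
    s->is_server = (listen_addr != NULL);
    s->board = MX_DEFAULT_BOARD_SKETCH;
    /* Assumption: a server opens its endpoint right away so it can
     * listen; a client defers until it actually has to send. */
    s->endpoint_open = s->is_server;
}

/* Called at the top of the send path. Returns 1 if the endpoint was
 * opened by this call, 0 if it was already open. */
int mx_ensure_endpoint_sketch(struct mx_state_sketch *s)
{
    assert(s != NULL);
    if (s->endpoint_open)
        return 0;
    /* Real code would open the MX endpoint on s->board here. */
    s->endpoint_open = 1;
    return 1;
}
```

A client built this way commits to a board only on its first unexpected send, which leaves room to replace the fixed default with a probe of each NIC later.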

Scott

