[Pvfs2-developers] Initializing a BMI method

Scott Atchley atchley at myri.com
Thu Sep 21 13:40:38 EDT 2006


On Sep 21, 2006, at 11:23 AM, Pete Wyckoff wrote:

> atchley at myri.com wrote on Thu, 21 Sep 2006 10:19 -0400:
>> If I use a URI of mx://hostname:board:endpoint for each peer, I am
>> wondering how the local bmi_mx gets its own name (in order to know
>> which NIC to use and which MX endpoint).
>>
>> Looking through bmi.c, activate_method() calls
>> BMI_meth_method_addr_lookup(listen_addr) before calling
>> BMI_meth_initialize(). The function activate_method() is called from
>> BMI_initialize() which indicates that listen_addr is "a comma
>> separated list of addresses to listen on for each method." Should I
>> assume that the following will happen to startup bmi_mx:
>>
>> new_addr = BMI_meth_method_addr_lookup("mx://myhostname:myboard:myport");
>> ...
>> BMI_meth_initialize(new_addr, ...)
>>
>> Is this a correct assumption for the method_addr_lookup() and order
>> of operations?
>
> Yes on the assumption, thus BMI_mx_method_addr_lookup() should not
> look at any state that needs to be initialized in
> BMI_meth_initialize.  It just munges strings.  The IB implementation
> relies on a static variable to check if it has been initialized or
> not.
>
> I'm not thoroughly happy with this situation.  If it is a major
> problem for MX, speak up.  We can fix the API.

I do not foresee a problem yet, but if one comes up, I'll let you  
know. :-)
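
For illustration, a string-only lookup might look something like the
sketch below. It assumes the mx://hostname:board:endpoint form from my
earlier message and touches no lists, locks, or other state, so it is
safe to call before initialization. The struct and function names
(mx_uri, mx_parse_uri, parse_field) are hypothetical and are not part
of BMI or MX.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define MX_HOST_MAX 256

/* Hypothetical holder for the pieces of an mx:// URI. */
struct mx_uri {
    char host[MX_HOST_MAX];
    unsigned int board;
    unsigned int endpoint;
};

/* Parse one unsigned decimal field that must be followed by end_char. */
static int parse_field(const char **p, char end_char, unsigned int *out)
{
    char *end;
    unsigned long v;

    if (**p < '0' || **p > '9')     /* rejects empty fields and signs */
        return -1;
    errno = 0;
    v = strtoul(*p, &end, 10);
    if (errno != 0 || *end != end_char || v > UINT_MAX)
        return -1;
    *out = (unsigned int)v;
    *p = end;
    return 0;
}

/* Pure string munging: returns 0 on success, -1 on a malformed URI. */
int mx_parse_uri(const char *uri, struct mx_uri *out)
{
    static const char prefix[] = "mx://";
    const char *p, *colon;
    size_t len;

    if (uri == NULL || out == NULL)
        return -1;
    if (strncmp(uri, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    p = uri + sizeof(prefix) - 1;

    colon = strchr(p, ':');         /* end of hostname */
    if (colon == NULL)
        return -1;
    len = (size_t)(colon - p);
    if (len == 0 || len >= MX_HOST_MAX)
        return -1;
    memcpy(out->host, p, len);
    out->host[len] = '\0';

    p = colon + 1;
    if (parse_field(&p, ':', &out->board) != 0)
        return -1;
    p++;                            /* skip ':' between board and endpoint */
    return parse_field(&p, '\0', &out->endpoint);
}
```

Because nothing here depends on method state, the call order in
activate_method() (lookup first, initialize second) is not a problem
for a function of this shape.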

>> If so, then my method_addr_lookup() function has to check if bmi_mx
>> has been initialized before using any lists, locks, etc., no?
>
> In BMI_mx_initialize, do whatever basic NIC-independent setup you
> need to do.  If called with a non-NULL listen_addr, act as a server
> and listen on the device passed in.  To do that you'll have to
> open the particular board in the address.  Ideally you'd have a way
> of listening on all boards since you don't know from where your
> clients will come yet.

And the non-NULL will be a mx://... string, no?
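
To make the initialization question concrete, here is a rough sketch
of the static-flag guard Pete mentions for IB, combined with the
server/client split on listen_addr. All names are hypothetical, and
the real BMI method signatures are not reproduced here; listen_addr is
shown as an opaque pointer because its exact form is the open question
above.

```c
#include <stddef.h>

/* Module-level state; zero means "not yet initialized". */
static int mx_initialized = 0;
static int mx_is_server = 0;

int mx_sketch_is_initialized(void) { return mx_initialized; }
int mx_sketch_is_server(void)      { return mx_is_server; }

/* Returns 0 on success, -1 if called twice without a finalize. */
int mx_sketch_initialize(const void *listen_addr)
{
    if (mx_initialized)
        return -1;

    /* NIC-independent setup (lists, locks, ...) would go here. */

    /* A non-NULL listen_addr means "act as a server"; opening the
     * board named in the address would happen at this point. */
    mx_is_server = (listen_addr != NULL);

    mx_initialized = 1;
    return 0;
}

void mx_sketch_finalize(void)
{
    mx_initialized = 0;
    mx_is_server = 0;
}
```

Any function that does need lists or locks could check
mx_sketch_is_initialized() first and fail early if it returns 0.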

As for listening on all boards, that is not supported by MX directly
(an MX endpoint is specific to a NIC). If this is a requirement of
PVFS (to be able to use multiple Myricom NICs), then we have a couple
of options. First, can PVFS open two BMI methods of the same type? If
so, then we just have to ensure that all bmi_mx state is local to
each process (no globals, etc.). If BMI cannot open two of the same
type, then I will have to manage two (or more) MX endpoints in bmi_mx.

That then raises the issue of whether the NICs are in the same fabric  
or in disjoint fabrics. If the same, then I can send (and receive)  
using either one and we will need to consider some form of striping  
over the NICs to ensure that we maximize utilization of both. If they  
are disjoint, I will need to maintain separate peer lists for each  
and determine which NIC needs to send and receive for a given peer.

Using more than one board definitely complicates matters. ;-)  
Initially, I will support one board to get things going.

> On a client, you don't know at initialize time to what server(s)
> you'll need to connect.  Somehow you'll have to prepare the device
> as needed in the first sendunexpected call.
>
> Does this work?
>
> 		-- Pete

I am not worried about peer (i.e. server) state at initialize time.
Sending to others is not an issue as long as the URI is passed as
part of the send context. Does the client also get a listen_addr
string or not? I would assume not. If that is right, I am free to
open any endpoint I want, and I would pass that info to the server
before sending my first sendunexpected message (i.e. in my method's
connect request message).

If the client does not get a listen_addr URI and if the machine has
multiple NICs, I will not know which one to open. I could have a
#define that compiles in the board number, but it would require that
all machines use the same board. Any suggestions?

Scott

