[Pvfs2-developers] Initializing a BMI method
Scott Atchley
atchley at myri.com
Thu Sep 21 13:40:38 EDT 2006
On Sep 21, 2006, at 11:23 AM, Pete Wyckoff wrote:
> atchley at myri.com wrote on Thu, 21 Sep 2006 10:19 -0400:
>> If I use a URI of mx://hostname:board:endpoint for each peer, I am
>> wondering how the local bmi_mx gets its own name (in order to know
>> which NIC to use and which MX endpoint).
>>
>> Looking through bmi.c, activate_method() calls
>> BMI_meth_method_addr_lookup(listen_addr) before calling
>> BMI_meth_initialize(). The function activate_method() is called from
>> BMI_initialize() which indicates that listen_addr is "a comma
>> separated list of addresses to listen on for each method." Should I
>> assume that the following will happen to startup bmi_mx:
>>
>> new_addr = BMI_meth_method_addr_lookup("mx://myhostname:myboard:myport");
>> ...
>> BMI_meth_initialize(new_addr, ...)
>>
>> Is this a correct assumption for the method_addr_lookup() and order
>> of operations?
>
> Yes on the assumption, thus BMI_mx_method_addr_lookup() should not
> look at any state that needs to be initialized in
> BMI_meth_initialize. It just munges strings. The IB implementation
> relies on a static variable to check if it has been initialized or
> not.
>
> I'm not thoroughly happy with this situation. If it is a major
> problem for MX, speak up. We can fix the API.
I do not foresee a problem yet, but if one comes up, I'll let you
know. :-)
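
A minimal sketch of a lookup that "just munges strings", as described above, might look like the following. It assumes the mx://hostname:board:endpoint form proposed earlier in this thread; the struct mx_uri type, the parse_mx_uri() name, and the 63-character hostname limit are hypothetical and not part of BMI or MX. It only validates the basic shape of the string.

```c
#include <stdio.h>

#define MX_HOST_MAX 64  /* hypothetical limit chosen for this sketch */

/* Hypothetical parsed form of mx://hostname:board:endpoint. */
struct mx_uri {
    char hostname[MX_HOST_MAX];
    unsigned int board;
    unsigned int endpoint;
};

/* Returns 0 on success, -1 if the string does not have the expected shape.
 * Pure string handling: touches no lists, locks, or other method state,
 * so it is safe to call before the method has been initialized. */
int parse_mx_uri(const char *s, struct mx_uri *out)
{
    char trailing;
    int n;

    if (s == NULL || out == NULL)
        return -1;

    /* %63[^:] reads up to 63 non-colon characters (MX_HOST_MAX - 1);
     * the final %c only matches if unexpected text follows the endpoint. */
    n = sscanf(s, "mx://%63[^:]:%u:%u%c",
               out->hostname, &out->board, &out->endpoint, &trailing);

    return (n == 3) ? 0 : -1;
}
```

Because the function keeps no state of its own, it does not need the static "initialized" flag that the IB implementation uses.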
>> If so, then my method_addr_lookup() function has to check if bmi_mx
>> has been initialized before using any lists, locks, etc., no?
>
> In BMI_mx_initialize, do whatever basic NIC-independent setup you
> need to do. If called with a non-NULL listen_addr, act as a server
> and listen on the device passed in. To do that you'll have to
> open the particular board in the address. Ideally you'd have a way
> of listening on all boards since you don't know from where your
> clients will come yet.
And the non-NULL will be an mx://... string, no?

As for listening on all boards, that is not supported by MX directly
(an MX endpoint is specific to a NIC). If this is a requirement of
PVFS (to be able to use multiple Myricom NICs), then we have a couple of
options. First, can PVFS open two BMI methods of the same type? If
so, then we just have to ensure that all bmi_mx state is local to
each process (no globals, etc.). If BMI cannot open two of the same
type, then I will have to manage two (or more) MX endpoints in bmi_mx.
That then raises the issue of whether the NICs are in the same fabric
or in disjoint fabrics. If the same, then I can send (and receive)
using either one and we will need to consider some form of striping
over the NICs to ensure that we maximize utilization of both. If they
are disjoint, I will need to maintain separate peer lists for each
and determine which NIC needs to send and receive for a given peer.
Using more than one board definitely complicates matters. ;-)
Initially, I will support one board to get things going.
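
To make the same-fabric versus disjoint-fabric distinction concrete, here is a rough sketch of board selection inside a single method that manages several boards. Everything in it is hypothetical: the board_state and board_set types, the fabric_id field (which assumes the method somehow knows which fabric each board and peer is on), and pick_board(). It alternates between boards message by message; it does not split a single message across NICs.

```c
#include <stddef.h>

#define MAX_BOARDS 4  /* hypothetical cap for this sketch */

/* Hypothetical per-board state; a real method would keep the MX endpoint here. */
struct board_state {
    unsigned int board;      /* NIC number */
    unsigned int fabric_id;  /* boards with equal ids share a fabric (assumed known) */
};

struct board_set {
    struct board_state boards[MAX_BOARDS];
    size_t count;
    size_t next;  /* round-robin cursor */
};

/* Pick the board to use for a peer that is reachable on peer_fabric.
 * The scan starts at the cursor, so boards sharing the peer's fabric are
 * used in turn, while boards on other fabrics are skipped.
 * Returns NULL if no local board is on the peer's fabric. */
const struct board_state *pick_board(struct board_set *set,
                                     unsigned int peer_fabric)
{
    size_t i;

    for (i = 0; i < set->count; i++) {
        size_t idx = (set->next + i) % set->count;

        if (set->boards[idx].fabric_id == peer_fabric) {
            set->next = (idx + 1) % set->count;
            return &set->boards[idx];
        }
    }
    return NULL;
}
```

With two boards on one fabric, successive calls for a peer on that fabric alternate between them; with disjoint fabrics, each peer always maps to the one board that can reach it.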
> On a client, you don't know at initialize time to what server(s)
> you'll need to connect. Somehow you'll have to prepare the device
> as needed in the first sendunexpected call.
>
> Does this work?
>
> -- Pete
I am not worried about peer (i.e. server) state at initialize time.
Sending to others is not an issue as long as the URI is passed as
part of the send context. Does the client also get a listen_addr
string or not? I would assume not. If it does not, I am free to open any
endpoint I want and then I would pass that info to the server before
sending my first sendunexpected message (i.e. in my method's connect
request message).
If the client does not get a listen_addr URI and if the machine has
multiple NICs, I will not know which one to open. I could have a
#define that compiles in the board number but it would require that
all machines use the same board. Any suggestions?
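
A sketch of that #define fallback, under the assumption that a server receives an mx://hostname:board:endpoint listen_addr and a client receives NULL. The macro name BMI_MX_DEFAULT_BOARD and the function select_board() are hypothetical, not existing BMI symbols.

```c
#include <stdio.h>

/* Hypothetical compile-time default; override with -DBMI_MX_DEFAULT_BOARD=n. */
#ifndef BMI_MX_DEFAULT_BOARD
#define BMI_MX_DEFAULT_BOARD 0
#endif

/* Decide which board to open.
 * With a listen_addr (server case), take the board from the URI.
 * With NULL (assumed client case), fall back to the compiled-in default.
 * Returns 0 on success, -1 if listen_addr is not an mx:// URI. */
int select_board(const char *listen_addr, unsigned int *board)
{
    if (listen_addr == NULL) {
        *board = BMI_MX_DEFAULT_BOARD;
        return 0;
    }

    /* %*[^:] skips the hostname without storing it. */
    if (sscanf(listen_addr, "mx://%*[^:]:%u", board) != 1)
        return -1;

    return 0;
}
```

The limitation noted above still applies: a compiled-in default only works if every client machine uses the same board number.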
Scott