[Pvfs2-developers] Re: the halloween bug fixed

Sam Lang slang at mcs.anl.gov
Mon Oct 8 14:50:26 EDT 2007


On Oct 8, 2007, at 12:31 PM, Scott Atchley wrote:

> On Oct 8, 2007, at 12:13 PM, Sam Lang wrote:
>
>> With mx, it looks like there's a limit on the number of  
>> connections from a peer (BMX_PEER_RX_NUM == 20).  As new  
>> connections are received the idle connections are closed?  Should  
>> bmi_method_addr_forget_callback be called from there?
>>
>> Thanks,
>> -sam
>
> Hi Sam,
>
> The BMX_PEER_RX_NUM simply sets the number of unexpected requests  
> to pre-allocate when a new peer is registered. This avoids the calls to  
> malloc() for unexpected messages up to this threshold. It does not  
> control re-connect attempts.
>
> In bmi_mx, a peer is only registered once. If any communication  
> fails, the connection state is marked BMX_PEER_DISCONNECT. It does  
> not delete the peer unless BMI_set_info(DROP_ADDR) is called.
>
> When a client reconnects, the server cancels any pending requests  
> from the peer, calls mx_iconnect() to establish a new connection back  
> to the client, and then sends a CONN_ACK message. After that,  
> communication resumes.
>
> If a connection is idle, bmi_mx (and MX) will happily leave it open  
> as long as the process is alive.
>
> I am not sure what you are proposing that  
> bmi_method_addr_forget_callback() do.

bmi_method_addr_reg_callback is called at mx.c:2290, in icon_ack,  
where peer->mxp_exist is set to 1.  It doesn't look like that  
variable is used (checked or set) anywhere else.  Does  
cleanup/shutdown of that peer ever occur?  Also, is it possible for the same  
client node to appear as multiple different peers to the server  
node?  It sounds like the answer is no, but I just want to make sure.

-sam

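[Illustrative sketch, not taken from mx.c.] The peer lifecycle Scott
describes above (a peer is registered once, a failure only marks it
disconnected, and a reconnect cancels pending requests, connects back,
and sends CONN_ACK) could be modelled roughly as below. All type and
function names are hypothetical stand-ins, and the counters merely
stand in for the real calls such as mx_iconnect().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real bmi_mx structures and names differ. */
enum peer_state { PEER_INIT, PEER_READY, PEER_DISCONNECT };

struct peer {
    enum peer_state state;
    int pending;    /* outstanding requests from this peer */
    int connects;   /* how many times we connected back to the client */
    int acks_sent;  /* how many CONN_ACK messages we sent */
};

/* A communication failure only marks the peer; the peer itself is kept. */
void peer_comm_failed(struct peer *p)
{
    assert(p != NULL);
    p->state = PEER_DISCONNECT;
}

/* Server-side handling of a client (re)connect request. */
void peer_handle_connect(struct peer *p)
{
    assert(p != NULL);
    p->pending = 0;         /* 1. cancel pending requests from the peer */
    p->connects++;          /* 2. stands in for mx_iconnect() back to the client */
    p->acks_sent++;         /* 3. stands in for sending the CONN_ACK message */
    p->state = PEER_READY;  /* 4. communication resumes */
}
```

Note that nothing in this sketch ever frees the peer, which matches the
description that a peer is only deleted via BMI_set_info(DROP_ADDR).
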

>
> By the way, there is a limit in bmi_mx to the number of known peers  
> which is currently set at 2^20. Having a large number of peers  
> (active and/or idle) should not cause any slow-down issues in  
> bmi_mx. Each incoming message (expected or unexpected) includes the  
> source endpoint address and MX has a function that allows you to  
> register a context with each endpoint address. bmi_mx uses this to  
> associate the peer's data structure with the peer's endpoint. So  
> for even unexpected messages, it is an O(1) lookup.
>
> Scott
>

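[Illustrative sketch, not taken from mx.c.] The O(1) lookup Scott
mentions works because the peer pointer is stored as a context on the
endpoint address itself, so no table search is needed when a message
arrives. A minimal model of that idea follows; struct ep_addr and the
two context helpers are hypothetical stand-ins for the MX endpoint
address type and its context functions, not the real API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an MX endpoint address with a user context. */
struct ep_addr {
    unsigned long long nic_id;
    unsigned int endpoint_id;
    void *context;  /* opaque pointer registered by the upper layer */
};

/* Hypothetical stand-in for the per-peer data kept by bmi_mx. */
struct peer {
    const char *name;
    int unexpected_msgs;
};

/* Every incoming message carries its source endpoint address. */
struct message {
    struct ep_addr *source;
};

/* Register the peer structure once, when the peer is first seen. */
void ep_addr_set_context(struct ep_addr *addr, void *ctx)
{
    assert(addr != NULL);
    addr->context = ctx;
}

void *ep_addr_get_context(const struct ep_addr *addr)
{
    assert(addr != NULL);
    return addr->context;
}

/* Constant-time lookup: a single pointer read, however many peers exist. */
struct peer *peer_from_message(const struct message *msg)
{
    assert(msg != NULL);
    return (struct peer *)ep_addr_get_context(msg->source);
}
```

With this layout, the cost of finding the peer for an unexpected message
does not depend on how many peers are registered, which is why a large
number of idle peers should not slow the receive path.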

