[Pvfs2-developers] Re: the halloween bug fixed

Scott Atchley atchley at myri.com
Mon Oct 8 13:31:34 EDT 2007


On Oct 8, 2007, at 12:13 PM, Sam Lang wrote:

> With mx, it looks like there's a limit on the number of connections  
> from a peer (BMX_PEER_RX_NUM == 20).  As new connections are  
> received the idle connections are closed?  Should  
> bmi_method_addr_forget_callback be called from there?
>
> Thanks,
> -sam

Hi Sam,

BMX_PEER_RX_NUM simply sets the number of unexpected requests to  
pre-allocate when a new peer is registered. This avoids the calls to  
malloc() for unexpected messages up to this threshold. It does not  
control re-connect attempts.
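
To illustrate the pre-allocation idea, here is a rough sketch of a per-peer pool. It is not the bmi_mx code: PEER_RX_NUM only mirrors the value of BMX_PEER_RX_NUM quoted above, and UNEX_BUF_SIZE, struct rx_buf, struct peer_pool and the pool_* functions are names made up for this example. Falling back to malloc() once the pool is empty is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PEER_RX_NUM   20    /* mirrors BMX_PEER_RX_NUM from the thread */
#define UNEX_BUF_SIZE 8192  /* hypothetical buffer size */

/* Hypothetical receive descriptor for one unexpected message. */
struct rx_buf {
    struct rx_buf *next;
    int from_pool;               /* 1 if allocated at registration time */
    char data[UNEX_BUF_SIZE];
};

/* Hypothetical per-peer pool of pre-allocated descriptors. */
struct peer_pool {
    struct rx_buf *free_list;
    size_t late_mallocs;         /* malloc() calls made after registration */
};

/* Release everything currently sitting on the free list. */
void pool_destroy(struct peer_pool *p)
{
    while (p->free_list) {
        struct rx_buf *b = p->free_list;
        p->free_list = b->next;
        free(b);
    }
}

/* Called once when the peer is registered: allocate PEER_RX_NUM buffers. */
int pool_init(struct peer_pool *p)
{
    p->free_list = NULL;
    p->late_mallocs = 0;
    for (int i = 0; i < PEER_RX_NUM; i++) {
        struct rx_buf *b = malloc(sizeof *b);
        if (!b) {
            pool_destroy(p);
            return -1;
        }
        b->from_pool = 1;
        b->next = p->free_list;
        p->free_list = b;
    }
    return 0;
}

/* Take a buffer; malloc() is only needed once the pool is exhausted. */
struct rx_buf *pool_get(struct peer_pool *p)
{
    struct rx_buf *b = p->free_list;
    if (b) {
        p->free_list = b->next;
        b->next = NULL;
        return b;
    }
    b = malloc(sizeof *b);
    if (b) {
        b->from_pool = 0;
        b->next = NULL;
        p->late_mallocs++;
    }
    return b;
}

/* Return a buffer: pooled ones are reused, overflow ones are freed. */
void pool_put(struct peer_pool *p, struct rx_buf *b)
{
    assert(b != NULL);
    if (b->from_pool) {
        b->next = p->free_list;
        p->free_list = b;
    } else {
        free(b);
    }
}
```

With this layout the first PEER_RX_NUM outstanding unexpected messages from a peer cost no allocation at all, and the constant has nothing to do with how many times that peer may connect.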

In bmi_mx, a peer is only registered once. If any communication  
fails, the connection state is marked BMX_PEER_DISCONNECT. bmi_mx does  
not delete the peer unless BMI_set_info(DROP_ADDR) is called.

When a client reconnects, the server cancels any pending requests  
from the peer, calls mx_iconnect() to establish a new connection back  
to the client, and then sends a CONN_ACK message. After that,  
communication resumes.
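
The reconnect sequence above can be sketched roughly as follows. This is a mock, not the bmi_mx source: the enum, struct peer and the helper functions are invented for the example, the helpers only count calls instead of touching the network, and the connect step is treated as synchronous even though mx_iconnect() is a non-blocking call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection states; only the DISCONNECT name echoes
 * BMX_PEER_DISCONNECT from the description above. */
enum peer_state { PEER_INIT, PEER_READY, PEER_DISCONNECT };

/* Hypothetical peer record. The counters stand in for real work. */
struct peer {
    enum peer_state state;
    int pending;     /* outstanding requests from this peer */
    int canceled;    /* requests canceled so far */
    int connects;    /* connect-back attempts so far */
    int acks_sent;   /* CONN_ACK messages sent so far */
};

/* Stub: cancel every request still pending for the peer. */
static void cancel_pending(struct peer *p)
{
    p->canceled += p->pending;
    p->pending = 0;
}

/* Stub standing in for the mx_iconnect() step. */
static void connect_back(struct peer *p)
{
    p->connects++;
}

/* Stub standing in for sending the CONN_ACK message. */
static void send_conn_ack(struct peer *p)
{
    p->acks_sent++;
}

/* Any communication failure only marks the peer; it is not deleted. */
void peer_comm_failed(struct peer *p)
{
    assert(p != NULL);
    p->state = PEER_DISCONNECT;
}

/* Server side, when a known client connects again. */
void peer_handle_reconnect(struct peer *p)
{
    assert(p != NULL);
    cancel_pending(p);     /* 1. drop stale requests */
    connect_back(p);       /* 2. new connection back to the client */
    send_conn_ack(p);      /* 3. tell the client we are ready */
    p->state = PEER_READY; /* 4. communication resumes */
}
```

The point of the sketch is that the same peer record survives the whole cycle; only its state and its pending requests change.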

If a connection is idle, bmi_mx (and MX) will happily leave it open  
as long as the process is alive.

I am not sure what you are proposing that  
bmi_method_addr_forget_callback() do.

By the way, there is a limit in bmi_mx on the number of known peers,  
which is currently set at 2^20. Having a large number of peers (active  
and/or idle) should not cause any slow-down issues in bmi_mx. Each  
incoming message (expected or unexpected) includes the source  
endpoint address, and MX has a function that allows you to register a  
context with each endpoint address. bmi_mx uses this to associate the  
peer's data structure with the peer's endpoint. So even for  
unexpected messages, it is an O(1) lookup.
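
A small mock shows why the lookup is constant time. No MX calls are used here: struct ep_addr, struct message, struct peer and the functions are all invented for the example, and the context pointer is stored directly on the mock address record. In MX itself the address is an opaque type and, as far as I recall, the context is attached and fetched with mx_set_endpoint_addr_context() and mx_get_endpoint_addr_context().

```c
#include <assert.h>
#include <stddef.h>

/* Mock endpoint address: carries one user-supplied context pointer. */
struct ep_addr {
    void *context;
};

/* Mock incoming message: every message names its source address. */
struct message {
    struct ep_addr *source;
    int unexpected;          /* 1 for unexpected, 0 for expected */
};

/* Hypothetical per-peer data structure. */
struct peer {
    int id;
    unsigned long msgs_seen;
};

static void addr_set_context(struct ep_addr *a, void *ctx)
{
    a->context = ctx;
}

static void *addr_get_context(const struct ep_addr *a)
{
    return a->context;
}

/* Done once, when the peer is registered. */
void peer_register(struct peer *p, struct ep_addr *a)
{
    assert(p != NULL && a != NULL);
    addr_set_context(a, p);
}

/* Done for every message: one pointer read, no table search,
 * regardless of how many peers are known. */
struct peer *peer_from_message(const struct message *m)
{
    struct peer *p = addr_get_context(m->source);
    if (p)
        p->msgs_seen++;
    return p;
}
```

Because the peer pointer travels with the address, the cost of finding the peer does not grow with the number of known peers, whether they are active or idle.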

Scott

