[PVFS-developers] Recovering from an IOD failure
Porter Don
PorterDE at mercury.hendrix.edu
Mon Feb 16 14:23:48 EST 2004
Yeah, I have a test case that always causes an fd leak without the patch,
but cannot trigger one with the patch. I am currently stress testing it to
see whether there are any inadvertent side effects (I would be very surprised
if there were, but better safe than sorry).
-----Original Message-----
From: Rob Ross
To: Porter Don
Cc: 'pvfs-developers at www.beowulf-underground.org'
Sent: 2/16/04 1:11 PM
Subject: RE:[PVFS-developers] Recovering from an IOD failure
Heh, cool; it was definitely worth a shot.
Any news re: the 1B problems?
Thanks!
Rob
On Mon, 16 Feb 2004, Porter Don wrote:
> >2) In mgr.c/send_req, if the manager had an open socket connection that
> >dies, there is no retry logic. It seems like the manager ought to at least
> >try once to reestablish the connection on an EPIPE. This would primarily
> >help the case where an iod died and came back up between requests.
>
> Yeah, this was a bad idea. I tinkered around with it and it quickly got
> into an infinite loop, so I am going to go with Rob in saying that the retry
> logic is just going to have to live in the client.
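
For illustration only, here is a minimal sketch of a retry on EPIPE that is
bounded by an explicit counter, so that it cannot spin forever. It is not
taken from mgr.c or from the patch discussed in this thread. The names
send_bounded_retry, reconnect_fn, demo_ctx, and demo_reconnect are
hypothetical, and the pipe-based demo_reconnect only stands in for
reopening a socket to an iod so the sketch can be exercised without a
network.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback: returns a freshly connected fd, or -1 on failure. */
typedef int (*reconnect_fn)(void *ctx);

/*
 * Write all of buf to *fd. On EPIPE, close the fd, reconnect, and resend
 * the whole request, at most max_retries times. Returns 0 on success and
 * -1 on failure. The caller still owns *fd afterwards.
 */
int send_bounded_retry(int *fd, const void *buf, size_t len,
                       reconnect_fn reconnect, void *ctx, int max_retries)
{
    const char *p = buf;
    size_t done = 0;
    int retries = 0;

    assert(fd != NULL && reconnect != NULL);

    /* Ignore SIGPIPE so a dead peer shows up as EPIPE instead of killing
     * the process. A real program would do this once at startup. */
    signal(SIGPIPE, SIG_IGN);

    while (done < len) {
        ssize_t n = write(*fd, p + done, len - done);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n == 0) {
            errno = EIO;            /* no progress: give up rather than spin */
            return -1;
        }
        if (errno == EINTR)
            continue;
        if (errno != EPIPE || retries >= max_retries)
            return -1;              /* other error, or retry budget spent */

        retries++;                  /* the counter is what bounds the loop */
        close(*fd);
        *fd = reconnect(ctx);
        if (*fd < 0)
            return -1;
        done = 0;                   /* resend from the start on the new fd */
    }
    return 0;
}

/* Demo-only state: counts reconnect attempts and keeps the read end. */
struct demo_ctx {
    int calls;
    int read_fd;
    int always_dead;
};

/* Demo-only stand-in for reconnecting: hands back the write end of a new
 * pipe. With always_dead set, the read end is closed at once, so the next
 * write fails with EPIPE, as it would if the peer had gone away again. */
int demo_reconnect(void *arg)
{
    struct demo_ctx *c = arg;
    int p[2];

    c->calls++;
    if (pipe(p) != 0)
        return -1;
    if (c->always_dead) {
        close(p[0]);
        return p[1];
    }
    c->read_fd = p[0];
    return p[1];
}
```

With max_retries set to 1 this matches the "try once" suggestion quoted
above: if the peer is still gone after one reconnect, the call fails and the
caller (for example, the client) decides what to do next.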