[PVFS-developers] IOD File Descriptor Leak

Porter Don PorterDE@mercury.hendrix.edu
Tue, 16 Sep 2003 09:36:59 -0500


I have been noticing what appears to be a file descriptor leak on some of
the iods.  Looking at the iod code, it seems to happen when a job is
cancelled between opening a file and doing a rw job.  The socket number in
the finfo struct is initialized to -1 and is not set to a real socket until
a rw job gets the finfo from the list.  If the field is never set, the finfo
is not cleaned up when the socket closes.  The leak seems to arise because
the do_rw_job function currently just grabs the first finfo for the inode it
wants and assigns its socket if it is still unset, completely ignoring all
but the first finfo on the linked list.
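
To make the sequence concrete, here is a small stand-alone model of what I
think is happening.  It is not the iod source: finfo_model and the model_*
functions are names I made up for illustration, and the real finfo struct
and do_rw_job carry more state (including the open file descriptor itself).
It only mimics the behavior described above: the socket starts at -1, the rw
lookup stops at the first entry for the inode, and cleanup matches on the
socket.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the iod's finfo struct. */
struct finfo_model {
    long ino;                  /* inode this entry refers to */
    int sock;                  /* -1 until a rw job claims the entry */
    struct finfo_model *next;
};

/* Model of an open: push a new entry whose socket is still unset. */
static struct finfo_model *model_open(struct finfo_model *head, long ino)
{
    struct finfo_model *f = malloc(sizeof *f);
    assert(f != NULL);
    f->ino = ino;
    f->sock = -1;
    f->next = head;
    return f;
}

/* Model of the rw lookup: take the FIRST entry for the inode and assign
 * the socket only if that entry is still unset.  Later entries for the
 * same inode are never looked at. */
static struct finfo_model *model_rw_lookup(struct finfo_model *head,
                                           long ino, int sock)
{
    for (struct finfo_model *f = head; f != NULL; f = f->next) {
        if (f->ino == ino) {
            if (f->sock == -1)
                f->sock = sock;
            return f;
        }
    }
    return NULL;
}

/* Model of cleanup on socket close: frees only entries whose socket
 * matches, so entries still at -1 are skipped.  Returns the new head. */
static struct finfo_model *model_close_sock(struct finfo_model *head, int sock)
{
    struct finfo_model **pp = &head;
    while (*pp != NULL) {
        struct finfo_model *f = *pp;
        if (f->sock == sock) {
            *pp = f->next;
            free(f);
        } else {
            pp = &f->next;
        }
    }
    return head;
}

/* Two opens of inode 42, rw jobs from sockets 7 and 5, then both sockets
 * close.  Returns the number of entries left behind. */
static int model_leak_demo(void)
{
    struct finfo_model *head = NULL;
    int leaked = 0;

    head = model_open(head, 42);
    head = model_open(head, 42);
    model_rw_lookup(head, 42, 7);   /* claims the first entry */
    model_rw_lookup(head, 42, 5);   /* sees the same entry, assigns nothing */
    head = model_close_sock(head, 7);
    head = model_close_sock(head, 5);

    while (head != NULL) {          /* count, then free, what was left */
        struct finfo_model *next = head->next;
        free(head);
        head = next;
        leaked++;
    }
    return leaked;
}
```

In this model one entry is left with its socket still at -1 after both
sockets have closed, which is the kind of orphan I believe is holding the
descriptors open.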

This raises the question: is there any reason to have multiple file
descriptors for the same inode file at all?  The iods are single-threaded
and are effectively sharing one anyway.  Changing the finfo structure to
keep a linked list of sockets as a sort of "reference count" (like the mgr)
might be more efficient and would plug this leak.  I suppose this would also
need to be coupled with something in check_socks that cleans up unclaimed
finfo structs after a period of idleness.
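
A rough sketch of the shape I have in mind follows.  Everything in it is an
assumption for discussion: shared_finfo, sock_ref, the sf_* functions and
the idle threshold are hypothetical names, not existing iod or mgr code, and
the open file descriptor itself is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* One node per socket that is using the file. */
struct sock_ref {
    int sock;
    struct sock_ref *next;
};

/* Hypothetical replacement: one entry per inode, shared by all sockets. */
struct shared_finfo {
    long ino;
    struct sock_ref *socks;   /* the list acts as the reference count */
    time_t last_used;         /* updated whenever a socket is added */
};

/* Record that sock uses this file; a socket already on the list is not
 * added twice.  Returns 0 on success, -1 if allocation fails. */
static int sf_add_sock(struct shared_finfo *f, int sock, time_t now)
{
    assert(f != NULL);
    f->last_used = now;
    for (struct sock_ref *r = f->socks; r != NULL; r = r->next)
        if (r->sock == sock)
            return 0;
    struct sock_ref *r = malloc(sizeof *r);
    if (r == NULL)
        return -1;
    r->sock = sock;
    r->next = f->socks;
    f->socks = r;
    return 0;
}

/* Remove sock from the list; returns how many references remain. */
static int sf_drop_sock(struct shared_finfo *f, int sock)
{
    assert(f != NULL);
    struct sock_ref **pp = &f->socks;
    int left = 0;
    while (*pp != NULL) {
        struct sock_ref *r = *pp;
        if (r->sock == sock) {
            *pp = r->next;
            free(r);
        } else {
            left++;
            pp = &r->next;
        }
    }
    return left;
}

/* Test a check_socks-style sweep could apply: the entry can go once no
 * socket holds it and it has been idle for at least max_idle seconds.
 * An entry that was opened but never claimed has an empty list, so it
 * ages out the same way. */
static int sf_reclaimable(const struct shared_finfo *f, time_t now,
                          double max_idle)
{
    assert(f != NULL);
    return f->socks == NULL && difftime(now, f->last_used) >= max_idle;
}

/* Walk through one lifetime; returns 1 if every step behaves as intended. */
static int sf_demo(void)
{
    time_t t0 = (time_t)1000;                 /* arbitrary "open" time */
    struct shared_finfo f = { 42, NULL, t0 };

    if (sf_add_sock(&f, 5, t0) != 0 || sf_add_sock(&f, 7, t0) != 0) {
        sf_drop_sock(&f, 5);
        return -1;
    }
    int left = sf_drop_sock(&f, 5);                   /* 7 still holds it */
    int held = sf_reclaimable(&f, t0 + 600, 300.0);   /* not yet */
    left += sf_drop_sock(&f, 7);                      /* list now empty */
    int fresh = sf_reclaimable(&f, t0 + 100, 300.0);  /* too recent */
    int idle = sf_reclaimable(&f, t0 + 600, 300.0);   /* old enough */
    return left == 1 && !held && !fresh && idle;
}
```

The point of the timestamp is that a finfo which is opened but never claimed
by a rw job has no socket to trigger its cleanup, so something periodic has
to notice it.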

Anyway, I am about to start working on this, but I thought I would bounce it
off of the list and see if anyone had any insights.

Thanks,
Don