[PVFS-developers] How to handle file unlinks when IOD(s) are down

David S. Metheny david.s.metheny at conwaycorp.net
Tue Oct 26 14:23:54 EDT 2004


	We are seeing file deletes leave dangling inode files (future .saveme
files) on the IOD(s) when a delete is issued from a client while one or more
IOD(s) are down (not accepting connections). The mgr deletes the metadata
file before it checks and attempts to delete the files from the IOD(s). One
or more of the IODs returns an error, but the end result is that the metadata
file is gone and the IOD(s) are left with dangling inode files. I'm looking
at working on a patch for this, but wanted a little direction first.

	I see that the do_unlink call executes md_unlink before calling the
IOD(s). The md_unlink actually opens the file to keep the inode from being
reused after the deletion. Would it be OK to call the IOD(s) and try the
deletion there first, before executing the deletion on the mgr? This could
also avoid having to open the metafile to keep the inode from being reused.
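
To make the proposed ordering concrete, here is a rough sketch in Python
rather than the mgr's own C. Everything in it is hypothetical: the Iod class,
the metadata dict and unlink_iods_first are stand-ins I made up to model the
idea, not PVFS code or APIs. It assumes a down IOD shows up as a connection
error.

```python
# Hypothetical model of "IODs first, metadata last"; not PVFS code.
class Iod:
    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.files = set()          # inode numbers this IOD holds data for

    def unlink(self, inode):
        if not self.up:
            raise ConnectionError(f"{self.name} is not accepting connections")
        self.files.discard(inode)


def unlink_iods_first(metadata, iods, path):
    """Remove IOD data first; drop the metadata only if every IOD succeeded.

    Returns the names of the IODs that failed (empty list on full success).
    """
    inode = metadata[path]
    failed = []
    for iod in iods:
        try:
            iod.unlink(inode)
        except ConnectionError:
            failed.append(iod.name)
    if failed:
        return failed               # metadata kept, so the inode stays in use
    del metadata[path]
    return []
```

With one IOD down, the call returns that IOD's name and leaves the metadata
entry in place, so the delete can simply be issued again later. The catch is
that the file stays visible even though part of its data is already gone.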

Some things I don't yet know how to handle, and which might throw big kinks
into the above idea:

- If even one IOD fails to delete the file (or only one succeeds), none of
the data can be recovered, so we really want to go ahead and delete the mgr
file anyhow, instead of leaving a bad file visible to all. Maybe we just want
to log what happens so we can perform some sort of active cleanup, instead of
waiting for the leftovers to turn into .saveme files.

- Also, it seems that we delete the manager file first, because we don't
want anyone trying to access the file while the IOD deletes are occurring. 
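
For the "log it and clean up actively" option in the first point above, a
minimal sketch could look like the following. Again this is Python and purely
illustrative: unlink_and_log, retry_pending, the iod_unlink callback and the
pending list are names I invented, and a real log would have to live on disk
on the mgr rather than in memory.

```python
# Hypothetical sketch: delete the metadata regardless, remember failed IODs.
def unlink_and_log(metadata, iod_unlink, iod_names, path, pending):
    """Drop the metadata entry, then record every (iod, inode) that failed."""
    inode = metadata.pop(path)      # file disappears from view immediately
    for name in iod_names:
        try:
            iod_unlink(name, inode)
        except ConnectionError:
            pending.append((name, inode))


def retry_pending(iod_unlink, pending):
    """Re-issue the logged deletes; keep only the ones that fail again."""
    still_pending = []
    for name, inode in pending:
        try:
            iod_unlink(name, inode)
        except ConnectionError:
            still_pending.append((name, inode))
    pending[:] = still_pending
```

This keeps the current behavior of hiding the file right away. What the
sketch does not handle is inode reuse: if the inode number is handed out
again before the retry runs, the retried delete would hit the new file's
data, so the number would have to stay reserved until its log entries clear.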

	Are there any tools available to clean up a PVFS cluster that has
these dangling inodes, rather than waiting for the files to be renamed to
.saveme files when the inode number is reused?
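
In case no such tool exists, the kind of offline check I have in mind is
sketched below in Python. It assumes we can get the set of inode numbers
still referenced by metadata on the mgr, and that we have some way to map an
IOD data file name back to its inode number. The inode_of callback stands in
for that mapping; I am deliberately not assuming any particular on-disk
naming here, and find_dangling is a made-up name.

```python
import os

# Hypothetical sketch of an orphan scan over one IOD's data directory.
def find_dangling(iod_data_dir, live_inodes, inode_of):
    """Return paths of data files whose inode has no metadata entry.

    inode_of(fname) must return the inode number for a data file name,
    or None for files that are not data files and should be skipped.
    """
    dangling = []
    for root, _dirs, files in os.walk(iod_data_dir):
        for fname in files:
            inode = inode_of(fname)
            if inode is not None and inode not in live_inodes:
                dangling.append(os.path.join(root, fname))
    return sorted(dangling)
```

The scan only reports candidates. It would need to run while the file system
is quiet, since a file created during the scan could look dangling if its
metadata was read before the file existed.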




