[PVFS2-developers] patches: st_nlink and rename fixes

Phil Carns pcarns at wastedcycles.org
Thu Jun 30 22:05:42 EDT 2005


dir-nlink.patch
---------------------
This forces the st_nlink field to stay fixed at 2 for all pvfs2 
directories.  You can view this field with stat().  The "normal" 
semantic is for st_nlink to provide a count of how many subdirectories 
are within a directory, including "." and "..", but not including files.  
Prior to this patch, pvfs2 tried to maintain this, but the count 
was a little off, and it couldn't keep up with subdirs added or removed 
by other clients anyway.  This led to negative st_nlink counts, with 
some possibly odd side effects.

Making st_nlink track the "right" value would be very expensive in pvfs2 
because gathering the information may require contacting multiple meta 
servers, and the value is of dubious practical use anyway :)   For 
historical reference, PVFS1 kept this value fixed as well, though it 
went with st_nlink=1 instead of st_nlink=2.

dirent-count.patch
----------------------
This adds a new directory attribute field that indicates how many 
entries (of any type) are in a given directory.  The server already had 
the ability to gather this information, but it just wasn't being 
reported.   A client can read this with getattr(), but it can't be 
modified directly.  This may have a couple of uses.  One example (used 
in the safer-rename patch below) is quickly detecting whether a 
directory is empty or not.

safer-rename.patch
-----------------------
This may take a minute to explain, so bear with me :)  The summary is 
that the rename() operation in pvfs2 is broken in some cases.  For 
example, you can rename over the top of a non-empty directory, and you 
can rename a file to a directory or vice versa.  Neither of those should 
be allowed.  The core of the problem is that pvfs2 can't easily detect 
those kinds of failures until late in the rename() process, and it 
doesn't have any recovery steps to undo a partial rename once the 
failure is detected.  As a side note, these problems mainly show up 
when you go through the system interface directly.  The kernel vfs adds 
some sanity checking if you access the FS through it instead.

Rename is the trickiest pvfs2 operation because it requires consistent 
metadata updates on as many as 4 objects, which can be scattered across 
servers: source object, target object, source parent, and target parent.  
So, to get all of this right and avoid races you have a few options:

1) Shared locks.  Bad for hopefully obvious reasons.
2) Server to server communication to coordinate updates.  Probably the 
best long term solution, but hard to do cleanly in the near term.
3) Checking preconditions on the client before starting rename().  This 
leaves a potential race window, but is enough to get the general case 
rename() working.
4) Let failure happen late in the rename process, and try to put all the 
objects back in the right place to recover.  This has a race like 3), 
but more dangerous.

So... this patch implements option 3.  It basically does a getattr on 
both the source and target objects (the target is optional, since it 
may not exist yet) before starting a rename().  From this it can check 
permissions, emptiness of the target if it is a directory, the type of 
both objects, etc.  It then errors out locally if needed.
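
The type and emptiness checks can be sketched roughly as follows.  This is only an illustration of the idea, not the code in the patch: struct obj_attr, enum obj_type, check_rename(), and the dirent_count field name are all hypothetical, the error codes are an assumption borrowed from POSIX rename() semantics, and the permission check is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the attributes a getattr would return. */
enum obj_type { OBJ_FILE, OBJ_DIR };

struct obj_attr {
    enum obj_type type;
    unsigned long dirent_count;   /* only meaningful for OBJ_DIR */
};

/* Client-side precondition check run before starting a rename.
 * `dst` may be NULL when the target does not exist.
 * Returns 0 if the rename may proceed, or a negative errno value
 * (assumed here to follow POSIX rename() conventions). */
int check_rename(const struct obj_attr *src, const struct obj_attr *dst)
{
    assert(src != NULL);          /* the source must always exist */

    if (dst == NULL)
        return 0;                 /* nothing to overwrite */

    if (src->type == OBJ_DIR && dst->type != OBJ_DIR)
        return -ENOTDIR;          /* directory over a file */

    if (src->type != OBJ_DIR && dst->type == OBJ_DIR)
        return -EISDIR;           /* file over a directory */

    if (dst->type == OBJ_DIR && dst->dirent_count > 0)
        return -ENOTEMPTY;        /* target directory still has entries */

    return 0;
}
```

Because the checks run on the client before any server state changes, another client can still modify the target between the getattr and the rename itself, which is the race window mentioned under option 3.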

I'm interested in seeing 2) happen eventually, but for now this is 
enough to at least make single client rename() operations work safely 
without too much fuss.

-Phil


-------------- next part --------------
A non-text attachment was scrubbed...
Name: dir-nlink.patch
Type: text/x-patch
Size: 3408 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20050630/1f437360/dir-nlink.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dirent-count.patch
Type: text/x-patch
Size: 13185 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20050630/1f437360/dirent-count.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: safer-rename.patch
Type: text/x-patch
Size: 7718 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20050630/1f437360/safer-rename.bin

