[PVFS2-developers] patches: st_nlink and rename fixes
Phil Carns
pcarns at wastedcycles.org
Thu Jun 30 22:05:42 EDT 2005
dir-nlink.patch
---------------------
This forces the st_nlink field to stay fixed at 2 for all pvfs2
directories. You can view this field with stat(). The "normal"
semantic is for st_nlink to provide a count of how many subdirectories
are within a directory, including "." and "..", but not including files.
Prior to this patch, pvfs2 tried to maintain this, but the count
was a little off, and it couldn't keep up with subdirs added or removed
by other clients anyway. This led to negative st_nlink counts, with
some possibly odd side effects.
Making st_nlink track the "right" value would be very expensive in pvfs2
because you might have to cross meta servers to get the information, and
it is of dubious practical use anyway :) For historical reference, PVFS1
kept this value fixed as well, though it went with st_nlink=1 instead of
st_nlink=2.
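For anyone who wants to look at the field in question, here is a small
Python sketch (not part of the patch) that prints st_nlink for a scratch
directory next to the count the "normal" semantic would predict. The
helper names conventional_nlink and demo are made up for illustration,
and some local file systems do not follow the convention either, so the
two numbers may differ.

```python
import os
import tempfile


def conventional_nlink(path):
    """Count "." and ".." plus each immediate subdirectory; files are ignored."""
    with os.scandir(path) as entries:
        subdirs = sum(1 for e in entries if e.is_dir(follow_symlinks=False))
    return 2 + subdirs


def demo():
    # Build a scratch directory with two subdirectories and one regular file.
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "a"))
        os.mkdir(os.path.join(top, "b"))
        with open(os.path.join(top, "f"), "w"):
            pass
        # First value: what stat() reports. Second: what the usual rule predicts.
        return os.stat(top).st_nlink, conventional_nlink(top)


if __name__ == "__main__":
    reported, predicted = demo()
    print("st_nlink reported by stat():", reported)
    print("count under the usual semantic:", predicted)
```

With the patch applied, a pvfs2 directory would report 2 here no matter
how many subdirectories it holds.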
dirent-count.patch
----------------------
This adds a new directory attribute field that indicates how many
entries (of any type) are in a given directory. The server already had
the ability to gather this information, but it just wasn't being
reported. A client can read this with getattr(), but it can't be
modified directly. This may have a couple of uses; one example (used in
the safer-rename patch below) is quickly detecting whether a directory
is empty.
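As a rough illustration of the emptiness check, here is a Python sketch.
DirAttr and its dirent_count field are hypothetical stand-ins, not the
actual pvfs2 attribute structure, and the sketch assumes the count does
not include "." and "..".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirAttr:
    """Hypothetical stand-in for the directory attributes a getattr returns."""
    dirent_count: int  # entries of any type; assumed to exclude "." and ".."


def is_empty_dir(attr):
    # The count alone answers the question; no listing of entries is needed here.
    return attr.dirent_count == 0
```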
safer-rename.patch
-----------------------
This may take a minute to explain, so bear with me :) The summary is
that the rename() operation in pvfs2 is broken in some cases. For
example, you can rename over the top of a non-empty directory, and you
can rename a file to a directory or vice versa. Neither of those should
be allowed. The core of the problem is that pvfs2 can't easily detect
those kinds of failures until late in the rename() process, and it
doesn't have any recovery steps to undo a partial rename once the
failure is detected. As a side note, most of these problems mainly show
up through the system interface. The kernel VFS adds some sanity
checking if you access the FS through the kernel instead.
Rename is the trickiest pvfs2 operation because it requires consistent
metadata updates on as many as 4 objects, which can be scattered across
servers: source object, target object, source parent, and target parent.
So, to get all of this right and avoid races you have a few options:
1) Shared locks. Bad for hopefully obvious reasons.
2) Server-to-server communication to coordinate updates. Probably the
best long-term solution, but hard to do cleanly in the near term.
3) Checking preconditions on the client before starting rename(). This
leaves a potential race window, but is enough to get the general case
rename() working.
4) Let failure happen late in the rename process, and try to put all the
objects back in the right place to recover. This has a race like 3),
but a more dangerous one.
So... this patch implements option 3. It basically does a getattr on
both the source and target objects (the target is optional, so only if
one exists) before starting a rename(). From this it can check
permissions, emptiness of the target if it is a directory, the type of
both objects, etc. It then errors out locally if needed.
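To make the checks concrete, here is a minimal Python sketch of the
type and emptiness preconditions. It is not the code from the patch:
Attr, its fields, and check_rename are hypothetical names, the errno
values simply follow what POSIX rename() specifies for these cases, and
the permission check is left out.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Attr:
    """Hypothetical getattr result; the field names are not the pvfs2 ones."""
    is_dir: bool
    dirent_count: int = 0  # only meaningful for directories


def check_rename(src: Attr, dst: Optional[Attr]) -> int:
    """Return 0 if the rename may proceed, else an errno value."""
    if dst is None:
        return 0  # no existing target, so nothing can be clobbered
    if src.is_dir and not dst.is_dir:
        return errno.ENOTDIR  # directory renamed over a file
    if not src.is_dir and dst.is_dir:
        return errno.EISDIR  # file renamed over a directory
    if dst.is_dir and dst.dirent_count > 0:
        return errno.ENOTEMPTY  # target directory still has entries
    return 0
```

The race window mentioned in 3) is still there: another client can
change either object between this check and the rename itself.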
I'm interested in seeing 2) happen eventually, but for now this is
enough to at least make single client rename() operations work safely
without too much fuss.
-Phil
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dir-nlink.patch
Type: text/x-patch
Size: 3408 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20050630/1f437360/dir-nlink.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dirent-count.patch
Type: text/x-patch
Size: 13185 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20050630/1f437360/dirent-count.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: safer-rename.patch
Type: text/x-patch
Size: 7718 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20050630/1f437360/safer-rename.bin