[Pvfs2-developers] kernel module copy attrs race

Phil Carns carns at mcs.anl.gov
Fri Mar 28 08:23:47 EST 2008


Pete Wyckoff wrote:
> slang at mcs.anl.gov wrote on Tue, 25 Mar 2008 15:10 -0500:
>> In fact our error handling here looks to be a little wrong.  When we return 
>> 0 from d_revalidate, the kernel assumes the file has been removed and 
>> populates the dcache with a negative dentry (then tries to create the file 
>> in the open case).  But if getattr fails after lookup succeeds, we may have 
>> more than an invalid dentry.
>>
>> In most cases where getattr returns an error we probably want to push that 
>> error all the way back up the stack, instead of just populating the dcache 
>> with a negative dentry.  We could check for ENOENT from getattr and return 
>> 0, otherwise return the errno?  Do the same for lookup failure?
>>
>> Also, if the kernel gives us a null dentry, do we really want to return 
>> invalid dentry, or EINVAL with a big hairy gossip_err message to the log?  
>> At least in the current kernel, d_revalidate is never called with a null 
>> dentry.
> 
> I'm so far out of my element here.  What you say makes a lot of
> sense; however, we should hesitate to change anything in a big way,
> given the 84 different versions of linux we support.  Maybe a
> testcase would help to motivate this sort of problem.  Could be it
> only occurs when multiple clients race on create/remove.

It might be a good start just to put a gossip_err() on that "error code 
other than ENOENT" case in CVS so we can keep an eye on it and make sure 
it doesn't happen in normal scenarios for a little while before changing 
the behavior.  Probably ditto for the null dentry case.  I agree it 
sounds like the way it probably is supposed to work, though.

-Phil


More information about the Pvfs2-developers mailing list