[Pvfs2-developers] patch: namei bug fixes

Phil Carns pcarns at wastedcycles.org
Tue Mar 27 16:33:59 EST 2007


Just to clarify a little bit, there is actually a problem with the 2.6 
code path here as well.  I went back and ran some tests with and without 
the namei.patch on a RHEL4 box (2.6.9-something) to confirm.  If I leave 
the patch out, then the following two LTP tests (open08, statfs02) fail:

<<<test_start>>>
tag=open08 stime=1175025961
cmdline="open08"
contacts=""
analysis=exit
initiation_status="ok"
<<<test_output>>>
open08      1  PASS  :  expected failure - errno = 17 : File exists
open08      2  PASS  :  expected failure - errno = 21 : Is a directory
open08      3  PASS  :  expected failure - errno = 20 : Not a directory
open08      4  FAIL  :  unexpected error - 2 : No such file or directory 
- expected 36
open08      5  PASS  :  expected failure - errno = 13 : Permission denied
open08      6  PASS  :  expected failure - errno = 14 : Bad address
<<<execution_status>>>
duration=0 termination_type=exited termination_id=1 corefile=no
cutime=0 cstime=1
<<<test_end>>>

<<<test_start>>>
tag=statfs02 stime=1175026039
cmdline="statfs02"
contacts=""
analysis=exit
initiation_status="ok"
<<<test_output>>>
statfs02    1  PASS  :  expected failure - errno = 20 : Not a directory
statfs02    2  PASS  :  expected failure - errno = 2 : No such file or 
directory
statfs02    3  FAIL  :  unexpected error - 2 : No such file or directory 
- expected 36
statfs02    4  PASS  :  expected failure - errno = 14 : Bad address
statfs02    5  PASS  :  expected failure - errno = 14 : Bad address
<<<execution_status>>>
duration=0 termination_type=exited termination_id=1 corefile=no
cutime=0 cstime=0
<<<test_end>>>

They pass without any trouble if I apply namei.patch.

I haven't dug into your technical comments below yet (inodes and 
dentries make my head hurt), but I'll try and catch up on them soon. 
 From a high level, it does sound like there are some odd things in the 
patch, but I don't know which part of it is relevant.  I just narrowed 
down cvsps patchsets until I found a code snippet that seemed to be the 
source of my problems and generated a patch to revert it :)

I wish that I could say that we could stop using 2.4 soon, but we can't 
quite seem to get to that point yet :(

-Phil

Sam Lang wrote:
> 
> Hi Phil,
> 
> I have comments about the patch inline.  Two general comments  though.  
> Reading this code (at least for me) has been hard because of  all the 
> feature check #ifdefs.  I think trying to support all these  different 
> kernel versions gives us a lot of headache that maybe we  don't need 
> (I've seen this recently with HAVE_AIO_VFS_SUPPORT).  Any  chance we 
> could pair it down to more recent versions?  Do you guys  expect to 
> discontinue supporting 2.4 in the near future?  Would it be  possible to 
> say that future pvfs releases only support 2.6 (maybe  even > 2.6.x)?  
> Anyone that has an older kernel has to use an older  pvfs version?
> 
> -sam
> 
> On Mar 20, 2007, at 9:36 AM, Phil Carns wrote:
> 
>> I am sending this patch in a separate email because it may need  some 
>> discussion to hash out.  Sometime in the past several months,  the 
>> pvfs2_lookup() function in namei.c changed (I think along with  
>> something not directly related, but I don't recall exactly what  
>> happened now).
>>
>> This change caused several directory related bugs to show up for us  
>> on 2.4 and 2.6 kernels.  The 2.4 one was more severe, though,  because 
>> it caused a kernel panic.  It could be triggered by the  "rename01" 
>> test in LTP, or by the following manual steps:
>>
>> [root at rhel3 pvfs2]# mkdir testdir
>> [root at rhel3 pvfs2]# cd testdir
>> [root at rhel3 testdir]# mkdir dir1
>> [root at rhel3 testdir]# mv dir1 dir2
>> [root at rhel3 testdir]# ls -alh
>> total 12K
>> drwxr-xr-x    1 root     root         4.0K Dec  1 12:18 .
>> drwxrwxrwt    1 root     root         4.0K Dec  1 12:17 ..
>> drwxr-xr-x    1 root     root         4.0K Dec  1 12:17 dir2
>> [root at centos-rhel3 testdir]# rm dir2
>> <crash>
>>
>> ... so it had something to do with removing a directory that had  
>> previously been renamed.
>>
>> At any rate, I don't know enough about dentries and inodes anymore  to 
>> truly understand the old logic that used to work or the newer  logic 
>> that causes us problems.  This patch just naively reverts  some of the 
>> logic in namei.c to the point that it works again for  us (without 
>> changing anything else that was in that set of  commits).  With this 
>> in place, we don't see any more test case  failures or kernel panics 
>> on 2.6 or 2.4.
>>
>> We have been using this patch for several months with success, but  it 
>> would probably be a good idea for someone more familiar with  this 
>> code to look at the change more carefully.
>>
> 
> I don't have access to a 2.4 kernel with root at the moment, but  
> following the code paths I have some comments.  Hopefully Murali can  
> chime in at some point and correct me where I'm wrong about stuff.
> 
>> -Phil
>> Index: pvfs2_src/src/kernel/linux-2.6/namei.c
>> ===================================================================
>> --- pvfs2_src/src/kernel/linux-2.6/namei.c    (revision 2909)
>> +++ pvfs2_src/src/kernel/linux-2.6/namei.c    (revision 2910)
>> @@ -164,19 +164,24 @@
>>      inode = pvfs2_iget(sb, &new_op->downcall.resp.lookup.refn);
>>      if (inode && !is_bad_inode(inode))
>>      {
>> -            struct dentry *res;
>> +            found_pvfs2_inode = PVFS2_I(inode);
>>
>> +            /* store the retrieved handle and fs_id */
>> +            found_pvfs2_inode->refn = new_op- 
>> >downcall.resp.lookup.refn;
>> +
> 
> 
> I think this is redundant.  Unless you're using a _very_ old 2.4  
> kernel, the pvfs2_iget call sets the fsid and handle in the pvfs2  inode 
> pointer of the inode.  Basically, pvfs2_iget translates to a  
> pvfs2_iget_common (with keep_locked == 0), which means that  
> pvfs2_set_inode will get called, which will do the same thing as  
> above.  The only way I can see that this wouldn't happen is if  
> iget4_locked isn't supported by your kernel version, but it appears  to 
> have been in 2.4.25 and up, so you'd have to be running with  something 
> pretty old.
> 
> I think that Murali added pvfs2_iget to abstract out this manual  
> setting, which will still appear to do in some places.
> 
>>          /* update dentry/inode pair into dcache */
>>          dentry->d_op = &pvfs2_dentry_operations;
>>
>> -            res = pvfs2_d_splice_alias(dentry, inode);
>> +            pvfs2_d_splice_alias(dentry, inode);
>>
>>              gossip_debug(GOSSIP_NAME_DEBUG, "Lookup success (inode  
>> ct = %d)\n",
>>                          (int)atomic_read(&inode->i_count));
>> +#if 0
>>              op_release(new_op);
>>              if (res)
>>                  res->d_op = &pvfs2_dentry_operations;
>>              return res;
>> +#endif
> 
> 
> Here too, if you're running 2.4, then res is guaranteed to equal  
> dentry, so while setting the dentry_operations struct to d_op is  
> redundant for 2.4, its needed for 2.6 in the case where we found a  
> disconnected dentry and returned that instead.  The only difference  
> here in the 2.4 code path that I can see is that if a new entry _was_  
> added to the dcache, you return NULL now, instead of returning the  new 
> entry.  Since the new entry is in the dcache, maybe that's ok  (and 
> expected for 2.4).  For 2.6 it seems clear that the new dentry  is 
> supposed to be returned.
> 
>>      }
>>          else if (inode && is_bad_inode(inode))
>>          {
>> @@ -227,7 +232,14 @@
>>      }
>>
>>      op_release(new_op);
>> -    return NULL;
>> +    if(ret != -ENOENT)
>> +    {
>> +        return ERR_PTR(ret);
>> +    }
>> +    else
>> +    {
>> +        return NULL;
>> +    }
> 
> 
> This might be what's causing your failures with 2.4.  It expects NULL  
> if a dentry was added to the dcache (which happens with ENOENT), or a  
> non-null error pointer for any other error.  As mentioned above, I  
> think the semantics of the return from lookup have changed in 2.6, so  
> we might want to change this up a bit.
> 
>>  }
>>
>>  /* return 0 on success; non-zero otherwise */
>> _______________________________________________
>> Pvfs2-developers mailing list
>> Pvfs2-developers at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> 
> 



More information about the Pvfs2-developers mailing list