[Pvfs2-developers] patch: namei bug fixes
Phil Carns
pcarns at wastedcycles.org
Tue Mar 27 16:33:59 EST 2007
Just to clarify a little bit, there is actually a problem with the 2.6
code path here as well. I went back and ran some tests with and without
the namei.patch on a RHEL4 box (2.6.9-something) to confirm. If I leave
the patch out, then the following two LTP tests (open08, statfs02) fail:
<<<test_start>>>
tag=open08 stime=1175025961
cmdline="open08"
contacts=""
analysis=exit
initiation_status="ok"
<<<test_output>>>
open08 1 PASS : expected failure - errno = 17 : File exists
open08 2 PASS : expected failure - errno = 21 : Is a directory
open08 3 PASS : expected failure - errno = 20 : Not a directory
open08 4 FAIL : unexpected error - 2 : No such file or directory
- expected 36
open08 5 PASS : expected failure - errno = 13 : Permission denied
open08 6 PASS : expected failure - errno = 14 : Bad address
<<<execution_status>>>
duration=0 termination_type=exited termination_id=1 corefile=no
cutime=0 cstime=1
<<<test_end>>>
<<<test_start>>>
tag=statfs02 stime=1175026039
cmdline="statfs02"
contacts=""
analysis=exit
initiation_status="ok"
<<<test_output>>>
statfs02 1 PASS : expected failure - errno = 20 : Not a directory
statfs02 2 PASS : expected failure - errno = 2 : No such file or
directory
statfs02 3 FAIL : unexpected error - 2 : No such file or directory
- expected 36
statfs02 4 PASS : expected failure - errno = 14 : Bad address
statfs02 5 PASS : expected failure - errno = 14 : Bad address
<<<execution_status>>>
duration=0 termination_type=exited termination_id=1 corefile=no
cutime=0 cstime=0
<<<test_end>>>
They pass without any trouble if I apply namei.patch.
I haven't dug into your technical comments below yet (inodes and
dentries make my head hurt), but I'll try and catch up on them soon.
From a high level, it does sound like there are some odd things in the
patch, but I don't know which part of it is relevant. I just narrowed
down cvsps patchsets until I found a code snippet that seemed to be the
source of my problems and generated a patch to revert it :)
I wish that I could say that we could stop using 2.4 soon, but we can't
quite seem to get to that point yet :(
-Phil
Sam Lang wrote:
>
> Hi Phil,
>
> I have comments about the patch inline. Two general comments though.
> Reading this code (at least for me) has been hard because of all the
> feature check #ifdefs. I think trying to support all these different
> kernel versions gives us a lot of headache that maybe we don't need
> (I've seen this recently with HAVE_AIO_VFS_SUPPORT). Any chance we
> could pair it down to more recent versions? Do you guys expect to
> discontinue supporting 2.4 in the near future? Would it be possible to
> say that future pvfs releases only support 2.6 (maybe even > 2.6.x)?
> Anyone that has an older kernel has to use an older pvfs version?
>
> -sam
>
> On Mar 20, 2007, at 9:36 AM, Phil Carns wrote:
>
>> I am sending this patch in a separate email because it may need some
>> discussion to hash out. Sometime in the past several months, the
>> pvfs2_lookup() function in namei.c changed (I think along with
>> something not directly related, but I don't recall exactly what
>> happened now).
>>
>> This change caused several directory related bugs to show up for us
>> on 2.4 and 2.6 kernels. The 2.4 one was more severe, though, because
>> it caused a kernel panic. It could be triggered by the "rename01"
>> test in LTP, or by the following manual steps:
>>
>> [root at rhel3 pvfs2]# mkdir testdir
>> [root at rhel3 pvfs2]# cd testdir
>> [root at rhel3 testdir]# mkdir dir1
>> [root at rhel3 testdir]# mv dir1 dir2
>> [root at rhel3 testdir]# ls -alh
>> total 12K
>> drwxr-xr-x 1 root root 4.0K Dec 1 12:18 .
>> drwxrwxrwt 1 root root 4.0K Dec 1 12:17 ..
>> drwxr-xr-x 1 root root 4.0K Dec 1 12:17 dir2
>> [root at centos-rhel3 testdir]# rm dir2
>> <crash>
>>
>> ... so it had something to do with removing a directory that had
>> previously been renamed.
>>
>> At any rate, I don't know enough about dentries and inodes anymore to
>> truly understand the old logic that used to work or the newer logic
>> that causes us problems. This patch just naively reverts some of the
>> logic in namei.c to the point that it works again for us (without
>> changing anything else that was in that set of commits). With this
>> in place, we don't see any more test case failures or kernel panics
>> on 2.6 or 2.4.
>>
>> We have been using this patch for several months with success, but it
>> would probably be a good idea for someone more familiar with this
>> code to look at the change more carefully.
>>
>
> I don't have access to a 2.4 kernel with root at the moment, but
> following the code paths I have some comments. Hopefully Murali can
> chime in at some point and correct me where I'm wrong about stuff.
>
>> -Phil
>> Index: pvfs2_src/src/kernel/linux-2.6/namei.c
>> ===================================================================
>> --- pvfs2_src/src/kernel/linux-2.6/namei.c (revision 2909)
>> +++ pvfs2_src/src/kernel/linux-2.6/namei.c (revision 2910)
>> @@ -164,19 +164,24 @@
>> inode = pvfs2_iget(sb, &new_op->downcall.resp.lookup.refn);
>> if (inode && !is_bad_inode(inode))
>> {
>> - struct dentry *res;
>> + found_pvfs2_inode = PVFS2_I(inode);
>>
>> + /* store the retrieved handle and fs_id */
>> + found_pvfs2_inode->refn = new_op-
>> >downcall.resp.lookup.refn;
>> +
>
>
> I think this is redundant. Unless you're using a _very_ old 2.4
> kernel, the pvfs2_iget call sets the fsid and handle in the pvfs2 inode
> pointer of the inode. Basically, pvfs2_iget translates to a
> pvfs2_iget_common (with keep_locked == 0), which means that
> pvfs2_set_inode will get called, which will do the same thing as
> above. The only way I can see that this wouldn't happen is if
> iget4_locked isn't supported by your kernel version, but it appears to
> have been in 2.4.25 and up, so you'd have to be running with something
> pretty old.
>
> I think that Murali added pvfs2_iget to abstract out this manual
> setting, which will still appear to do in some places.
>
>> /* update dentry/inode pair into dcache */
>> dentry->d_op = &pvfs2_dentry_operations;
>>
>> - res = pvfs2_d_splice_alias(dentry, inode);
>> + pvfs2_d_splice_alias(dentry, inode);
>>
>> gossip_debug(GOSSIP_NAME_DEBUG, "Lookup success (inode
>> ct = %d)\n",
>> (int)atomic_read(&inode->i_count));
>> +#if 0
>> op_release(new_op);
>> if (res)
>> res->d_op = &pvfs2_dentry_operations;
>> return res;
>> +#endif
>
>
> Here too, if you're running 2.4, then res is guaranteed to equal
> dentry, so while setting the dentry_operations struct to d_op is
> redundant for 2.4, its needed for 2.6 in the case where we found a
> disconnected dentry and returned that instead. The only difference
> here in the 2.4 code path that I can see is that if a new entry _was_
> added to the dcache, you return NULL now, instead of returning the new
> entry. Since the new entry is in the dcache, maybe that's ok (and
> expected for 2.4). For 2.6 it seems clear that the new dentry is
> supposed to be returned.
>
>> }
>> else if (inode && is_bad_inode(inode))
>> {
>> @@ -227,7 +232,14 @@
>> }
>>
>> op_release(new_op);
>> - return NULL;
>> + if(ret != -ENOENT)
>> + {
>> + return ERR_PTR(ret);
>> + }
>> + else
>> + {
>> + return NULL;
>> + }
>
>
> This might be what's causing your failures with 2.4. It expects NULL
> if a dentry was added to the dcache (which happens with ENOENT), or a
> non-null error pointer for any other error. As mentioned above, I
> think the semantics of the return from lookup have changed in 2.6, so
> we might want to change this up a bit.
>
>> }
>>
>> /* return 0 on success; non-zero otherwise */
>> _______________________________________________
>> Pvfs2-developers mailing list
>> Pvfs2-developers at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>
>
More information about the Pvfs2-developers
mailing list