[Pvfs2-developers] pvfs kernel module crash on 2.4.21-sgi-hacked
Murali Vilayannur
murali.vilayannur at gmail.com
Wed Dec 20 14:49:47 EST 2006
Hi Pete,
Interesting that you mentioned that this worked prior to June. I don't
recall any checkins to this part of the code at all ..oh well.
I wonder why the parent dentry (i.e. the dentry of the directory being
opened) did not have a refcount > 0; perhaps we/VFS missed a dget()
in some path prior to the open...?
I think we may not have tested this path or run into this issue before :(
Is the memory on the system flaky or something? Unlikely.. but still :)
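To make the refcount rule concrete, here is a rough userspace sketch of
the check that fires in the backtrace below. It is a model, not kernel
code: the model_* names are made up for illustration, d_count is a plain
int rather than an atomic, and model_dget() returns NULL at the point
where the real 2.4 dget() would BUG().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for struct dentry; only the refcount is modeled. */
struct model_dentry {
    int d_count;                     /* reference count */
    struct model_dentry *d_parent;   /* unused here, kept for shape */
};

/* A new reference may only be taken on a dentry somebody already holds. */
static int model_can_dget(const struct model_dentry *d)
{
    return d != NULL && d->d_count > 0;
}

/* Take a reference. Returns NULL where the real dget() would BUG()
 * (assumption of this model: NULL input is treated the same way). */
static struct model_dentry *model_dget(struct model_dentry *d)
{
    if (!model_can_dget(d))
        return NULL;
    d->d_count++;
    return d;
}

/* Drop a reference previously taken with model_dget(). */
static void model_dput(struct model_dentry *d)
{
    assert(d != NULL && d->d_count > 0);
    d->d_count--;
}
```

A dentry that reaches open() with d_count == 0 fails model_dget(), which
is the situation the trace shows for file->f_dentry.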
As you have rightly deduced, we don't need the call to
dcache_dir_open() on directory opens.
It would have been needed had we made use of the libfs stuff. If you
look at fs/libfs.c on 2.6, open() of a directory creates a new child
dentry and saves it in ->private_data. Close dput()s it, while
readdir() uses it as the cursor to walk through the entries, and so
does lseek(). (libfs is a librification of the most commonly used
operations, so that in-memory file systems like ramfs, relayfs etc. do
not need to reinvent them.)
Since pvfs2 has its own readdir/lseek implementations, this cursor
stuff is not needed even on 2.6 kernels.
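For reference, the libfs open/close cursor pattern can be sketched in
userspace roughly like this. Again this is only a model under stated
assumptions: the model_* names and structs are invented, malloc/free
stand in for the dentry cache, the cursor's own count is skipped, and
allocation simply fails where the 2.4 kernel would BUG() on a parent
with a zero refcount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins; only the fields this sketch needs. */
struct model_dentry {
    int d_count;
    struct model_dentry *d_parent;
};

struct model_file {
    struct model_dentry *f_dentry;   /* directory being opened */
    void *private_data;              /* libfs keeps its cursor here */
};

/* Allocate a child of parent and pin the parent with one reference. */
static struct model_dentry *model_d_alloc(struct model_dentry *parent)
{
    struct model_dentry *child;

    if (parent == NULL || parent->d_count <= 0)
        return NULL;                 /* real kernel: BUG() inside dget() */
    child = malloc(sizeof(*child));
    if (child == NULL)
        return NULL;
    child->d_count = 1;
    child->d_parent = parent;
    parent->d_count++;               /* the dget(parent) step */
    return child;
}

/* open(): create the cursor dentry and stash it in private_data. */
static int model_dir_open(struct model_file *file)
{
    file->private_data = model_d_alloc(file->f_dentry);
    return file->private_data ? 0 : -1;
}

/* close(): release the cursor and the parent reference it held. */
static int model_dir_close(struct model_file *file)
{
    struct model_dentry *cursor = file->private_data;

    if (cursor != NULL) {
        assert(cursor->d_parent->d_count > 0);
        cursor->d_parent->d_count--;
        free(cursor);
        file->private_data = NULL;
    }
    return 0;
}
```

A file system that never reads the cursor back in readdir()/lseek()
gets nothing from this pair except the extra reference on the parent.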
Can you retest with that stuff yanked out if you get time?
thanks,
Murali
> [7]kdb> bt
> Stack traceback for pid 12721
> 0xe00001b00b190000 12721 12719 1 7 R 0xe00001b00b1905a0
> *bash
> 0xe0000000044ffb90 __out_of_line_bug+0x70
> args (0x103, 0xe0000000045c59f0, 0x40b)
> kernel .text 0xe000000004400000 0xe0000000044ffb20
> 0xe0000000044ffbc0
> 0xe0000000045c59f0 d_alloc+0x1f0
> args (0xe00001b00b310b80, 0xe00000000503c2ac,
> 0xe00000300d632380, 0x0, 0xe00000000503c2a8)
> kernel .text 0xe000000004400000 0xe0000000045c5800
> 0xe0000000045c5b80
> 0xe0000000045b9f40 dcache_dir_open+0x40
> args (0xe00001300e858400, 0xe000033007a53c10,
> 0xa000000009d8f350, 0x38a)
> kernel .text 0xe000000004400000 0xe0000000045b9f00
> 0xe0000000045b9f80
> 0xa000000009d8f350 [pvfs2]pvfs2_file_open+0x2b0
> args (0xe00001300e858400, 0xe000033007a53b80,
> 0xa000000009dac7e8, 0xe00000000458af60, 0xa000000009dad7ec)
> pvfs2 .text 0xa000000009d840c0 0xa000000009d8f0a0
> 0xa000000009d8f3c0
> 0xe00000000458b180 dentry_open+0x240
> args (0xe00001b00b310b80, 0xe00001b07ba5ef80,
> 0xe000033007a53c18, 0xe0000000051ff200, 0xe000033007a53b80)
> kernel .text 0xe000000004400000 0xe00000000458af40
> 0xe00000000458b3c0
> 0xe00000000458af20 filp_open+0xc0
> args (0xe00000300d1c2000, 0x18800, 0x40000000000e86b0,
> 0xe00000000458b910, 0x792)
> kernel .text 0xe000000004400000 0xe00000000458ae60
> 0xe00000000458af40
> 0xe00000000458b910 sys_open+0xd0
> args (0x60000000000522a0, 0x10800, 0x40000000000e86b0,
> 0xc000000000000b19, 0x60000000000522c0)
> kernel .text 0xe000000004400000 0xe00000000458b840
> 0xe00000000458bb40
> 0xe00000000440e300 ia64_ret_from_syscall
> args (0x60000000000522a0, 0x60000000000522ab,
> 0x6000000000011308, 0x60000000000522a0, 0x40000000000e7d50)
> kernel .text 0xe000000004400000 0xe00000000440e300
> 0xe00000000440e320
>
> That hits this code in pvfs2_file_open:
>
>     if (S_ISDIR(inode->i_mode))
>     {
>         ret = dcache_dir_open(inode, file);
>     }
>     else
>     {
>         ...
>
> which then dies in dget() as called by d_alloc() because the refcnt
> on file->f_dentry is zero.
>
> I'm not particularly motivated to work on fixing this problem for
> such an ancient kernel, which furthermore has patches from SGI on
> top of it. But the pvfs kernel mount used to work on this kernel
> with the June 2006 CVS. I looked through the diff from then, but
> didn't see anything obvious in this area.
>
> Nosed around a bit and found that the only other caller of
> dcache_dir_open() in the tree is autofs4, which uses it directly in
> its _dir_operations struct when it instantiates a directory inode,
> not the way it is done in pvfs.
>
> In the 2.6 kernel the function is still there, and there are now two
> in-tree callers: fs/autofs4 and some virtual FS for Cell.
>
> I'm a bit confused as to why we need to do anything in the directory
> open path; i.e. why is there even a function, and why does that
> function call dcache_dir_open()?
>
> If nobody is particularly excited about debugging this, no big deal.
> Troy's not too thrilled about crashing the Altix anymore, and maybe
> we can pressure the admins to switch to a 2.6 kernel.
>
> -- Pete