[Pvfs2-developers] Re: [Pvfs2-users] pvfs-2.7.1 client causes kernel 2.6.25 bug (?)

Sam Lang slang at mcs.anl.gov
Tue Jun 3 18:41:57 EDT 2008


Hi Andrew,

Were you seeing any other problems with the PVFS volume before the  
unmount?  Did a directory listing hang or anything like that?

I've included the original problem report below in case other PVFS
developers have some ideas.  The bug message is saying that the dentry
for the current directory (.) still has a reference count of 1 at
unmount time.  It's a weird place to see that error though.  We just
added the current (.) and parent (..) dirs to a readdir call through
filldir.  Also, the kernel code still cleans up the dentry anyway after
reporting that message, and the pvfs2 volume gets unmounted, so I'm
curious as to how this causes the deterioration over time.  We also
never increment the d_count of a dentry ourselves, so somehow the
interaction with the kernel interfaces is causing the d_count to get
above 1 (it starts out at 1), and then get decremented later.
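
To make the reference counting concrete, here is a minimal userspace
sketch of the check that produces the message. It is not the kernel
code: the toy_ names are made up for illustration, and the real d_count
is an atomic counter inside struct dentry. It only shows the rule that
every reference taken must be dropped again before unmount.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct dentry; only the
 * reference count and the name matter for this illustration. */
struct toy_dentry {
    int d_count;        /* number of outstanding references */
    const char *name;   /* entry name, e.g. "." */
};

/* Take a reference (loosely models dget()). */
void toy_dget(struct toy_dentry *d)
{
    d->d_count++;
}

/* Drop a reference (loosely models dput(), without any freeing). */
void toy_dput(struct toy_dentry *d)
{
    d->d_count--;
}

/* Loosely models the unmount-time check: returns 1 and prints a
 * message if the dentry is still referenced, 0 otherwise. */
int toy_umount_check(const struct toy_dentry *d)
{
    if (d->d_count != 0) {
        printf("BUG: Dentry {n=%s} still in use (%d)\n",
               d->name, d->d_count);
        return 1;
    }
    return 0;
}
```

In this model, a dentry created with d_count = 1 trips the check until
a matching toy_dput() brings the count to 0, which is the state the
message in the log is complaining about.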

-sam

On Jun 2, 2008, at 10:15 PM, Andrew Pochinsky wrote:

> Hi,
> 	I found the following in the logs on a client, and I think it might
> be interesting to pvfs2 developers. After it happened, the client
> slowly deteriorated over time. Yesterday it started losing other
> file systems and had to be power-cycled. The machine is a 4-core x86-64
> (Xeon E5410 @ 2.33GHz) running openSUSE 10.3 with kernel 2.6.25 on
> 8GB of memory.
> Thanks,
> --andrew
>
> May 20 12:45:45 home kernel: BUG: Dentry ffff8101e403b970{i=0,n=.}  
> still in use (1) [unmount of pvfs2 pvfs2]
> May 20 12:45:45 home kernel: ------------[ cut here ]------------
> May 20 12:45:45 home kernel: kernel BUG at fs/dcache.c:637!
> May 20 12:45:45 home kernel: invalid opcode: 0000 [1] SMP
> May 20 12:45:45 home kernel: CPU 3
> May 20 12:45:45 home kernel: Modules linked in: pvfs2 iptable_filter
> ip_tables x_tables bonding ipv6 microcode firmware_class ext3 jbd
> mbcache loop e1000e sr_mod iTCO_wdt i2c_i801 i2c_core
> iTCO_vendor_support cdrom sg linear raid456 async_xor async_memcpy
> async_tx xor raid0 ehci_hcd uhci_hcd usbcore sd_mod dm_snapshot edd
> dm_mod raid1 reiserfs 3w_9xxx ata_piix ahci libata scsi_mod
> May 20 12:45:45 home kernel: Pid: 3089, comm: umount Not tainted  
> 2.6.25-fs1 #1
> May 20 12:45:45 home kernel: RIP: 0010:[<ffffffff80295080>]
> [<ffffffff80295080>] shrink_dcache_for_umount_subtree+0x123/0x1f5
> May 20 12:45:45 home kernel: RSP: 0018:ffff81022e5bfdd8  EFLAGS:  
> 00010292
> May 20 12:45:45 home kernel: RAX: 0000000000000053 RBX:  
> ffff8101e403b970 RCX: 0000000000000001
> May 20 12:45:45 home kernel: RDX: 0000000100000000 RSI:  
> 0000000000000096 RDI: ffffffff80482da0
> May 20 12:45:45 home kernel: RBP: ffff810239450160 R08:  
> ffffffff80482d90 R09: ffff810001011308
> May 20 12:45:45 home kernel: R10: 0000000000000001 R11:  
> 0000000000000000 R12: ffff8101e403b9d0
> May 20 12:45:45 home kernel: R13: 0000000000000000 R14:  
> 0000000000000000 R15: 0000000000000000
> May 20 12:45:46 home kernel: FS:  00007f77bde866f0(0000)  
> GS:ffff81023f063240(0000) knlGS:0000000000000000
> May 20 12:45:46 home kernel: CS:  0010 DS: 0000 ES: 0000 CR0:  
> 000000008005003b
> May 20 12:45:46 home kernel: CR2: 00007f77bd771842 CR3:  
> 0000000238893000 CR4: 00000000000006e0
> May 20 12:45:46 home kernel: DR0: 0000000000000000 DR1:  
> 0000000000000000 DR2: 0000000000000000
> May 20 12:45:46 home kernel: DR3: 0000000000000000 DR6:  
> 00000000ffff0ff0 DR7: 0000000000000400
> May 20 12:45:46 home kernel: Process umount (pid: 3089, threadinfo  
> ffff81022e5be000, task ffff81023d90a7f0)
> May 20 12:45:46 home kernel: Stack:  ffff81023b4cfe38  
> ffff81023b4cfc00 ffffffff8824a900 ffff81023b4cfcd0
> May 20 12:45:46 home kernel:  ffff81023b01d3c0 ffffffff80295907  
> ffff81023b4cfc00 ffffffff80286e05
> May 20 12:45:46 home kernel:  0000000000000000 0000000000000010  
> ffff81023b4cfc00 ffffffff80286eff
> May 20 12:45:46 home kernel: Call Trace:
> May 20 12:45:46 home kernel:  [<ffffffff80295907>] ? shrink_dcache_for_umount+0x2f/0x3d
> May 20 12:45:46 home kernel:  [<ffffffff80286e05>] ? generic_shutdown_super+0x19/0xec
> May 20 12:45:46 home kernel:  [<ffffffff80286eff>] ? kill_anon_super+0x9/0x31
> May 20 12:45:46 home kernel:  [<ffffffff882383f4>] ? :pvfs2:pvfs2_kill_sb+0x10a/0x18d
> May 20 12:45:46 home kernel:  [<ffffffff80286fa8>] ? deactivate_super+0x66/0x7e
> May 20 12:45:46 home kernel:  [<ffffffff802997d3>] ? sys_umount+0x314/0x344
> May 20 12:45:46 home kernel:  [<ffffffff80256277>] ? audit_syscall_entry+0x12d/0x160
> May 20 12:45:46 home kernel:  [<ffffffff80256586>] ? audit_syscall_exit+0x2dc/0x2fa
> May 20 12:45:46 home kernel:  [<ffffffff8020be69>] ? tracesys+0xdc/0xe1
> May 20 12:45:46 home kernel:
> May 20 12:45:46 home kernel:
> May 20 12:45:46 home kernel: Code: 8b 08 48 8b 43 10 48 85 c0 74 04  
> 48 8b 50 40 48 8d 86 38 02 00 00 48 c7 c7 de 42
> 44 80 48 89 de 48 89 04 24 31 c0 e8 09 86 f9 ff <0f> 0b eb fe 48 8b  
> 6b 28 48 39 dd 75 04 31 ed eb 04 f0 ff 4d 00
> May 20 12:45:46 home kernel: RIP  [<ffffffff80295080>]  
> shrink_dcache_for_umount_subtree+0x123/0x1f5
> May 20 12:45:46 home kernel:  RSP <ffff81022e5bfdd8>
> May 20 12:45:46 home kernel: ---[ end trace 5e3c2636391d1bf8 ]---
>
> _______________________________________________
> Pvfs2-users mailing list
> Pvfs2-users at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users


