[Pvfs2-developers] help with kernel dentry revalidate problem

Sam Lang slang at mcs.anl.gov
Mon Oct 8 20:11:21 EDT 2007


Hi Murali,

I was able to verify that your latest patch fixes the problem with  
the simul test #7, so I went ahead and committed it.

Also, when the problem actually existed, running simul #7 a bunch and  
then trying to unload the kernel module was giving an error:

[  806.396608] slab error in kmem_cache_destroy(): cache `pvfs2_op_cache': Can't free all objects
[  806.413800].
[  806.413801] Call Trace:
[  806.413820]  [<ffffffff802d51a5>] kmem_cache_destroy+0x95/0xe0
[  806.413832]  [<ffffffff8850eeb9>] :pvfs2:pvfs_kmem_cache_destroy+0x9/0x10
[  806.413838]  [<ffffffff8850eff0>] :pvfs2:op_cache_finalize+0x10/0x40
[  806.413845]  [<ffffffff88519695>] :pvfs2:pvfs2_exit+0x155/0x185
[  806.413850]  [<ffffffff802ac740>] sys_delete_module+0x1b0/0x200
[  806.413861]  [<ffffffff8026111e>] system_call+0x7e/0x83
[  806.413868].
[  806.413870] pvfs2: module version 2.7.0pre1-2007-10-08-204358 unloaded


I did some debugging, and it looks like the op cache entry that wasn't
getting released came from a lookup: there's a case where lookup can
return an error other than ENOENT, and the op entry doesn't get
released on that path.  I've attached a patch that I think fixes the
problem.  Can you verify that it looks ok?  Also, I've seen this
error before on other systems (I think Pete has too), and I'm not sure
it's always from this one case.  Is there a good way to verify that
we're releasing ops (and possibly other cache entries)
appropriately?  Just looking for ideas to harden the code in the kmod.
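
To make the idea concrete, here is a minimal userspace sketch of the
pattern I have in mind.  It is not the attached namei.patch and not
actual pvfs2 code: struct op, op_alloc(), op_release(), lookup_sketch()
and ops_leaked() are all hypothetical names made up for illustration.
It shows a lookup that releases its op on a single exit path, whatever
the status is, plus an outstanding-op counter that could be checked
before the cache is destroyed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an entry in the op cache. */
struct op {
    int status;
};

/* Number of ops allocated but not yet released (illustrative only). */
static long ops_outstanding;

struct op *op_alloc(void)
{
    struct op *op = calloc(1, sizeof(*op));

    if (op)
        ops_outstanding++;
    return op;
}

void op_release(struct op *op)
{
    assert(op != NULL);
    ops_outstanding--;
    free(op);
}

/*
 * Hypothetical lookup.  server_status stands in for whatever status the
 * serviced operation came back with (0, -ENOENT, or some other error).
 */
int lookup_sketch(int server_status)
{
    struct op *op = op_alloc();
    int ret;

    if (!op)
        return -ENOMEM;

    op->status = server_status;
    ret = op->status;

    /*
     * Single exit path: the op is released for success, for -ENOENT,
     * and for every other error, so no branch can skip the release.
     */
    op_release(op);
    return ret;
}

/* A check of this kind could run just before the cache is destroyed. */
long ops_leaked(void)
{
    return ops_outstanding;
}
```

Calling lookup_sketch() with 0, -ENOENT and -EIO in turn and then
checking that ops_leaked() is still zero is the kind of test I'd like
to be able to run against the real code paths.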

-sam

-------------- next part --------------
A non-text attachment was scrubbed...
Name: namei.patch
Type: application/octet-stream
Size: 554 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20071008/7dcc6b3d/namei.obj
-------------- next part --------------

On Aug 16, 2007, at 12:59 PM, Murali Vilayannur wrote:

> Kevin,
> Instead of the call to d_add(), can you replace it by a
> pvfs2_d_splice_alias() with the same parameters as before and
> recompile/reload and see if that fixes the crash.
> Something like the attached..
> thanks,
> Murali
>
> On 8/16/07, Kevin Harms <harms at alcf.anl.gov> wrote:
>> Murali,
>>
>>         I tried the patch (applied it to the 2.6.3 source); it crashes
>> on one of the machines.
>>         i send you an email with dmesg output.
>>
>> Kevin
>>
>> <dcache.patch>
