[Pvfs2-developers] inode locking in d_revalidate

Phil Carns carns at mcs.anl.gov
Mon Jun 30 09:37:51 EDT 2008


Sam and I looked at this some more on Friday, and we don't think that 
the i_mutex lock is needed in d_revalidate at this point.  It looks like 
the main problem was really fixed by Sam's reorganization of the 
attribute copying functions.

I went ahead and backed the lock out of cvs trunk and will run some 
simul tests to make sure everything is in working order.

-Phil

Phil Carns wrote:
> I've narrowed down a deadlock situation that I need some help untangling.
> 
> I got started down this path trying to reproduce a bug that Bart 
> reported in this thread:
> http://www.beowulf-underground.org/pipermail/pvfs2-developers/2008-June/004080.html 
> 
> 
> Before I got to that point I got sidetracked with the rename06 test case 
> within LTP.  Rename06 deadlocks the file system and becomes unkillable 
> (even with kill -9).  The particular case that causes this can be 
> reproduced on the  command line with trunk as follows:
> 
>   mkdir dir1
>   mv dir1 dir1/dir2
> 
> That mv command should gracefully return an error about moving dir1 to a 
> subdirectory of itself, but it deadlocks instead.
> 
> After chasing this around a bit, I found that it is hanging on 
> dcache.c:112:
> 
> http://www.pvfs.org/fisheye/browse/PVFS/src/kernel/linux-2.6/dcache.c?r=1.41#l112 
> 
> 
> The pvfs2_inode_lock() call is locking the i_mutex variable in this 
> case.  This lock was added to solve the simul open problem that Sam ran 
> into a while back:
> 
> http://www.beowulf-underground.org/pipermail/pvfs2-developers/2008-March/003975.html 
> 
> 
> Unfortunately, in this rename case the kernel is already holding i_mutex 
> when it calls d_revalidate.  The culprit is the do_rename() function, 
> which locks the mutex within a subroutine called lock_rename():
> 
> http://lxr.linux.no/linux+v2.6.21/fs/namei.c#L2520
> 
> ... and then indirectly triggers a d_revalidate as part of lookup_hash() 
> a few lines later while still holding it:
> 
> http://lxr.linux.no/linux+v2.6.21/fs/namei.c#L2522
> 
> These steps happen as the kernel is trying to find out information about 
> the two directories in order to perform rename error checking.
> 
> Anyone have any ideas what to do?
> 
> The copy_attributes_to_inode() function (the core problem in the 
> d_revalidate/simul problem) already uses a different and apparently safe 
> lock for some things: i_lock by way of pvfs2_lock_inode().  It does not 
> currently cover the perms or mode though.  I kind of wonder if expanding 
> that lock to cover more fields in copy_attributes_to_inode() would be 
> enough to fix the simul problem, but I can't find any documentation for 
> which fields i_lock is supposed to protect compared to i_mutex.
> 
> I also apologize for there being both a pvfs2_lock_inode() and 
> pvfs2_inode_lock() macro that do different things.  I added the latter 
> recently when we needed a configure test for i_mutex changing to i_sem, 
> but I didn't notice that there was a confusingly similar macro already. 
>  We should definitely rename it :)
> 
> -Phil
> 
> 
> 
> 
> 
> 
> _______________________________________________
> Pvfs2-developers mailing list
> Pvfs2-developers at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers



More information about the Pvfs2-developers mailing list