[Pvfs2-developers] inode locking in d_revalidate

Phil Carns carns at mcs.anl.gov
Fri Jun 27 15:36:14 EDT 2008


I've narrowed down a deadlock situation that I need some help untangling.

I got started down this path trying to reproduce a bug that Bart 
reported in this thread:
http://www.beowulf-underground.org/pipermail/pvfs2-developers/2008-June/004080.html

Before I got to that point I got sidetracked with the rename06 test case 
within LTP.  Rename06 deadlocks the file system and becomes unkillable 
(even with kill -9).  The particular case that causes this can be 
reproduced on the  command line with trunk as follows:

   mkdir dir1
   mv dir1 dir1/dir2

That mv command should gracefully return an error about moving dir1 to a 
subdirectory of itself, but it deadlocks instead.

After chasing this around a bit, I found that it is hanging on dcache.c:112:

http://www.pvfs.org/fisheye/browse/PVFS/src/kernel/linux-2.6/dcache.c?r=1.41#l112

The pvfs2_inode_lock() call is locking the i_mutex variable in this 
case.  This lock was added to solve the simul open problem that Sam ran 
into a while back:

http://www.beowulf-underground.org/pipermail/pvfs2-developers/2008-March/003975.html

Unfortunately, in this rename case the kernel is already holding i_mutex 
when it calls d_revalidate.  The culprit is the do_rename() function, 
which locks the mutex within a subroutine called lock_rename():

http://lxr.linux.no/linux+v2.6.21/fs/namei.c#L2520

... and then indirectly triggers a d_revalidate as part of lookup_hash() 
a few lines later while still holding it:

http://lxr.linux.no/linux+v2.6.21/fs/namei.c#L2522

These steps happen as the kernel is trying to find out information about 
the two directories in order to perform rename error checking.

Anyone have any ideas what to do?

The copy_attributes_to_inode() function (the core problem in the 
d_revalidate/simul problem) already uses a different and apparently safe 
lock for some things: i_lock by way of pvfs2_lock_inode().  It does not 
currently cover the perms or mode though.  I kind of wonder if expanding 
that lock to cover more fields in copy_attributes_to_inode() would be 
enough to fix the simul problem, but I can't find any documentation for 
which fields i_lock is supposed to protect compared to i_mutex.

I also apologize for there being both a pvfs2_lock_inode() and 
pvfs2_inode_lock() macro that do different things.  I added the latter 
recently when we needed a configure test for i_mutex changing to i_sem, 
but I didn't notice that there was a confusingly similar macro already. 
  We should definitely rename it :)

-Phil








More information about the Pvfs2-developers mailing list