[Pvfs2-developers] inode locking in d_revalidate
Phil Carns
carns at mcs.anl.gov
Mon Jun 30 09:37:51 EDT 2008
Sam and I looked at this some more on Friday, and we don't think that
the i_mutex lock is needed in d_revalidate at this point. It looks like
the main problem was really fixed by Sam's reorganization of the
attribute copying functions.
I went ahead and backed the lock out of cvs trunk and will run some
simul tests to make sure everything is in working order.
-Phil
Phil Carns wrote:
> I've narrowed down a deadlock situation that I need some help untangling.
>
> I got started down this path trying to reproduce a bug that Bart
> reported in this thread:
> http://www.beowulf-underground.org/pipermail/pvfs2-developers/2008-June/004080.html
>
>
> Before I got to that point I got sidetracked with the rename06 test case
> within LTP. Rename06 deadlocks the file system and becomes unkillable
> (even with kill -9). The particular case that causes this can be
> reproduced on the command line with trunk as follows:
>
> mkdir dir1
> mv dir1 dir1/dir2
>
> That mv command should gracefully return an error about moving dir1 to a
> subdirectory of itself, but it deadlocks instead.
>
> After chasing this around a bit, I found that it is hanging on
> dcache.c:112:
>
> http://www.pvfs.org/fisheye/browse/PVFS/src/kernel/linux-2.6/dcache.c?r=1.41#l112
>
>
> The pvfs2_inode_lock() call is locking the i_mutex variable in this
> case. This lock was added to solve the simul open problem that Sam ran
> into a while back:
>
> http://www.beowulf-underground.org/pipermail/pvfs2-developers/2008-March/003975.html
>
>
> Unfortunately, in this rename case the kernel is already holding i_mutex
> when it calls d_revalidate. The culprit is the do_rename() function,
> which locks the mutex within a subroutine called lock_rename():
>
> http://lxr.linux.no/linux+v2.6.21/fs/namei.c#L2520
>
> ... and then indirectly triggers a d_revalidate as part of lookup_hash()
> a few lines later while still holding it:
>
> http://lxr.linux.no/linux+v2.6.21/fs/namei.c#L2522
>
> These steps happen as the kernel is trying to find out information about
> the two directories in order to perform rename error checking.
>
> Anyone have any ideas what to do?
>
> The copy_attributes_to_inode() function (the core problem in the
> d_revalidate/simul problem) already uses a different and apparently safe
> lock for some things: i_lock by way of pvfs2_lock_inode(). It does not
> currently cover the perms or mode though. I kind of wonder if expanding
> that lock to cover more fields in copy_attributes_to_inode() would be
> enough to fix the simul problem, but I can't find any documentation for
> which fields i_lock is supposed to protect compared to i_mutex.
>
> I also apologize for there being both a pvfs2_lock_inode() and
> pvfs2_inode_lock() macro that do different things. I added the latter
> recently when we needed a configure test for i_mutex changing to i_sem,
> but I didn't notice that there was a confusingly similar macro already.
> We should definitely rename it :)
>
> -Phil
>
>
>
>
>
>
> _______________________________________________
> Pvfs2-developers mailing list
> Pvfs2-developers at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
More information about the Pvfs2-developers
mailing list