[Pvfs2-developers] help with kernel dentry revalidate problem

Phil Carns pcarns at wastedcycles.org
Wed Aug 15 20:50:43 EDT 2007


You might want to try repeating the test with the pvfs2-client set to 
disable the ncache and acache (set the timeout to zero either in proc or 
with command line arguments).  I don't know if they are playing any role 
or not, but it may at least simplify the debugging a little.

-Phil

Pete Wyckoff wrote:
> Sam and I have been tracking down a pvfs bug when using the VFS
> interface.  Kevin discovered it.
> 
> The code is test #7 in simul.  It runs in parallel, four tasks,
> tasks 0 and 1 on node1 and tasks 2 and 3 on node2.  It does:
>     
>     if (task == 0)
>         mkdir("foo");
> 
>     MPI_Barrier();
>     sleep(3);
> 
>     stat("foo");
> 
>     if (task == 0)
>         rmdir("foo");
> 
> On a freshly initialized pvfs (1 server for both md + io), it works.
> Task 0 creates the directory, and all four tasks stat it
> successfully.  When the process exits, the directory is indeed gone.
> 
> The second time you run it, tasks 2 and 3 (on node2) get -ENOENT
> from the stat, but tasks 0 and 1 work fine as before and the
> directory was indeed created properly.
> 
> Looking down a bit further, the server sees lookup requests from
> tasks 2 and 3, and returns the proper handle ID.  Then it sees
> getattr requests from tasks 2 and 3 for the handle ID that the
> directory had on the first run, not the handle ID for this run.
> 
> We may have traced this to the kernel module.  Some of the log looks
> like this:
> 
>     pvfs2_d_revalidate_common: called on dentry ffff81003df86970.
>     pvfs2_d_revalidate_common: parent found.
>     pvfs2_d_revalidate_common: attempting lookup.
>     Alloced OP (ffff81003d98a1f8: 121 OP_LOOKUP)
>     pvfs2: service_operation: pvfs2_lookup ffff81003d98a1f8
>     client-core: reading op tag 120 OP_LOOKUP
>     client-core: reading op tag 121 OP_LOOKUP
>     (get) Alloced OP (ffff81003df561b8:120)
>     (get) Alloced OP (ffff81003d98a1f8:121)
>     pvfs2: service_operation pvfs2_lookup returning: 0 for ffff81003df561b8.
>     pvfs2_d_revalidate_common: lookup failure or no match.
>     Releasing OP (ffff81003df561b8: 120)
>     pvfs2_getattr: called on simul_dir_stat.0
>     pvfs2_inode_getattr: called on inode 1048471
> 
> Something calls revalidate on the dentry.  The lookup returns
> successful from userspace.  The kernel sees that the handles are
> different:
> 
>             if((new_op->downcall.status != 0) ||
>                     !match_handle(new_op->downcall.resp.lookup.refn.handle, inode))
>             {
>                 gossip_debug(GOSSIP_DCACHE_DEBUG, "pvfs2_d_revalidate_common: lookup failure or no match.\n");
>                 op_release(new_op);
>                 return(0);
>             }
> 
> But then it immediately issues a getattr for the old handle ID.  Anybody
> know how to fix or destroy the bad dentry?  (Looks at Murali...)
> 
> 		-- Pete