[Pvfs2-developers] concurrent ls and rm

Sam Lang slang at mcs.anl.gov
Wed Sep 5 11:13:46 EDT 2007


On Sep 5, 2007, at 10:31 AM, Phil Carns wrote:

> We have run into a problem with running "rm -rf" and "ls"  
> concurrently on the same directory from different client nodes.  In  
> the particular case that we are looking at, the directory has about  
> 7000 files in it but no subdirectories.  If we do an ls on the  
> directory while an "rm -rf" is running from a different client,  
> then the rm fails to remove all of the files.  It seems to get  
> worse if you do more than one ls while the rm is working.  This is  
> on RHEL4 with 2.6.9.something kernels.
>
> Has anyone else seen this?  Any idea what the problem is?

Hi Phil,

The trove layer caches the position -> name mapping for each position it
returns to the client on a readdir.  The problem is probably related to
that cache: the readdir for the rm is iterating over the directory and
inserting position -> name entries into the cache, and then ls comes
along and replaces those entries with its own, where the position is the
same but the name is further down in the directory (because rm has
removed some of the earlier entries).  That's just a guess though.

You could see if disabling that position cache fixes the problem.  Note
that disabling it will cause the Berkeley DB iteration to walk through
all the entries up to the position, so it's going to be much slower.
The position cache is in dbpf-keyval-pcache.c.
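
To make the guess above concrete, here is a toy Python model of the
suspected race.  It is not PVFS2 code: ToyDir, pcache, and rm_all are
made-up names, and the real cache in dbpf-keyval-pcache.c may be keyed
and filled differently.  The only assumption carried over is that an
integer position maps to a cached name, and that the cache is shared by
every client reading the same directory.

```python
import bisect


class ToyDir:
    """Toy directory whose readdir tokens are integer positions (hypothetical)."""

    def __init__(self, names):
        self.names = sorted(names)
        # position -> last name returned at that position; shared by all clients
        self.pcache = {}

    def readdir(self, pos, count):
        if pos == 0:
            start = 0
        else:
            # Resume just after whatever name is cached for this position.
            start = bisect.bisect_right(self.names, self.pcache[pos])
        batch = self.names[start:start + count]
        new_pos = pos + len(batch)
        if batch:
            self.pcache[new_pos] = batch[-1]
        return batch, new_pos

    def remove(self, name):
        self.names.remove(name)


def rm_all(d, count, ls_after_first_batch):
    """Remove entries batch by batch; optionally let an 'ls' run in between."""
    pos = 0
    first = True
    while True:
        batch, pos = d.readdir(pos, count)
        if not batch:
            break
        for name in batch:
            d.remove(name)
        if first and ls_after_first_batch:
            # Another client's ls starts from the top and overwrites pcache[count].
            d.readdir(0, count)
        first = False
    return list(d.names)  # whatever rm failed to remove
```

In this model, a single ls issued after rm's first batch overwrites the
cached name for that position with one further down the directory, so
rm's next readdir resumes too late and skips a whole batch of entries.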

Probably the right long-term solution is to return the name as the
position, instead of an int.
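
A minimal Python sketch of that idea, under the assumption that the
server can seek directly to a name (as a sorted key-value store
typically can).  NameTokenDir and rm_all_with_ls are hypothetical names
for illustration, not PVFS2 interfaces.  The token handed back to the
client is the last name returned, so the server keeps no per-position
state that another client could overwrite.

```python
import bisect


class NameTokenDir:
    """Toy directory whose readdir token is the last name returned (hypothetical)."""

    def __init__(self, names):
        self.names = sorted(names)

    def readdir(self, token, count):
        # None means "start at the beginning"; otherwise resume after the name.
        start = 0 if token is None else bisect.bisect_right(self.names, token)
        batch = self.names[start:start + count]
        next_token = batch[-1] if batch else token
        return batch, next_token

    def remove(self, name):
        self.names.remove(name)


def rm_all_with_ls(d, count):
    """Remove every entry while an 'ls' restarts from the top after each batch."""
    token = None
    while True:
        batch, token = d.readdir(token, count)
        if not batch:
            break
        for name in batch:
            d.remove(name)
        # Concurrent ls from another client; it cannot disturb rm's token.
        d.readdir(None, count)
    return list(d.names)  # whatever rm failed to remove
```

Because the resume point is a name rather than a count, it stays
meaningful even after earlier entries have been removed, and the
interleaved ls has nothing shared to clobber.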

-sam



>
> thanks,
> -Phil


