[Pvfs2-developers] duplicate entries in directory listing

Sam Lang slang at mcs.anl.gov
Mon Oct 9 11:58:37 EDT 2006


This has got to be caused by the way I did the caching of positions on  
the server.  I think it might make sense to replace that with code  
that uses the component name as the position, instead of trying to  
debug this problem.  I feel like the caching is the inherent problem  
that causes these bugs, and using the component name should solve them.
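
To illustrate why a name-based position should be more robust, here is a minimal sketch in Python. It is not PVFS2 code: the sorted list, the function names, and the plain numeric offset are all assumptions made for the example, and the offset only stands in for any position that is not tied to a component name. It models one create landing between two readdir batches.

```python
import bisect


def list_by_index(entries, position, count):
    """Resume from a numeric offset into the sorted entry list."""
    chunk = entries[position:position + count]
    return chunk, position + len(chunk)


def list_by_name(entries, last_name, count):
    """Resume strictly after the last component name already returned."""
    start = 0 if last_name is None else bisect.bisect_right(entries, last_name)
    chunk = entries[start:start + count]
    return chunk, (chunk[-1] if chunk else last_name)


def demo():
    entries = ["b", "d", "f", "h"]  # hypothetical sorted directory contents

    # First batch of two entries under each scheme.
    by_index, pos = list_by_index(entries, 0, 2)
    by_name, last = list_by_name(entries, None, 2)

    # A concurrent create lands before the resume point.
    bisect.insort(entries, "a")

    # Second batch: the numeric offset now points one entry too early.
    more_index, _ = list_by_index(entries, pos, 10)
    more_name, _ = list_by_name(entries, last, 10)
    return by_index + more_index, by_name + more_name


if __name__ == "__main__":
    index_listing, name_listing = demo()
    print("offset position:", index_listing)
    print("name position:  ", name_listing)
```

In this model the offset-based listing returns "d" twice, while the name-based listing returns each entry at most once. Neither listing includes "a", which was created behind the resume point.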

We talked about using the component name as the position back in July  
during the futures meeting, and IIRC we left it at the problem of the  
kernel module not always having enough space for all the entries  
returned, and so the position would have to be modified on the client  
somehow.  Then we sort of went off and discussed operators on  
positions...

I talked with Murali about this again recently, and it sounds like we  
can grab previous component names and use those as the position, so  
maybe that's the way to go.  I'll look into doing that and see how  
much work it is.
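
A rough sketch of how that could handle the space problem, again in Python and again with made-up names (server_readdir, client_listing, and the two size parameters are hypothetical, not the kernel module or server interface): when the client cannot hold everything the server returned, it resumes from the last name it actually consumed instead of adjusting a server-side position.

```python
import bisect


def server_readdir(entries, after_name, max_entries):
    """Return up to max_entries names that sort strictly after after_name."""
    start = 0 if after_name is None else bisect.bisect_right(entries, after_name)
    return entries[start:start + max_entries]


def client_listing(entries, server_batch, client_room):
    """List a directory when the client holds fewer entries than the server sends.

    client_room must be at least 1, otherwise no progress can be made.
    """
    seen = []
    cursor = None
    while True:
        batch = server_readdir(entries, cursor, server_batch)
        if not batch:
            return seen
        taken = batch[:client_room]  # the rest is dropped for lack of space
        seen.extend(taken)
        cursor = taken[-1]           # resume after the last name consumed


if __name__ == "__main__":
    names = sorted("f%03d" % i for i in range(20))
    print(client_listing(names, server_batch=5, client_room=3) == names)
```

The dropped entries are simply fetched again on the next request, so the cost is some repeated work on the server rather than any arithmetic on positions.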

-sam

On Oct 9, 2006, at 10:14 AM, Phil Carns wrote:

> We are seeing a strange bug where if we list the contents of a directory
> while files are being created in it, we sometimes get duplicates and/or
> missing files in the output.
>
> I can reproduce it on a single machine by running these two scripts at
> the same time:
>
> tester.sh:
> -----------------------------------
>    #!/bin/tcsh
>
>    foreach file ( `seq 1 10000` )
>       touch /mnt/pvfs2/testdir/${file}
>    end
>
> watcher.sh:
> -----------------------------------
>    #!/bin/tcsh
>
>    there:
>       set foo=`ls /mnt/pvfs2/testdir | wc -l`
>       set bar=`ls /mnt/pvfs2/testdir | uniq -d | wc -l`
>       echo listing count: $foo, duplicates: $bar
>       sleep 1
>    goto there
>
>
> The test machine that I am using is pretty slow.  On faster machines you
> may need to create more than 10,000 files, or maybe slow it down by
> actually writing a little bit of data into each file.
>
> At any rate, the output looks normal for a while, but then we start
> seeing results like this from watcher.sh:
>
> ...
> listing count: 6310, duplicates: 0
> listing count: 6320, duplicates: 0
> listing count: 6334, duplicates: 0
> listing count: 6371, duplicates: 0
> listing count: 6382, duplicates: 0
> listing count: 6396, duplicates: 5024
> listing count: 6406, duplicates: 0
> listing count: 10896, duplicates: 5344
> listing count: 6430, duplicates: 5120
> listing count: 6434, duplicates: 0
> listing count: 11574, duplicates: 6048
> listing count: 6472, duplicates: 0
> ...
>
> The listing count is supposed to steadily increase, and the duplicates
> field should always be zero.  The problem only occurs while files are
> being created.  Once tester.sh is done, the listing looks perfectly
> normal.
>
> Anyone have any ideas?  I think this problem has been hanging around for
> a little while but we just now figured out how to reliably trigger it.
> It is at least in current cvs head and was in a snapshot from August 21.
>
> -Phil