[Pvfs2-developers] Listing performance patch

Rob Ross rross at mcs.anl.gov
Fri Sep 12 05:47:07 EDT 2008


Excellent. Thanks Bart, all! -- Rob

On Sep 11, 2008, at 9:32 AM, Phil Carns wrote:

> Hi Bart,
>
> I fixed a silly bug in our readdir logic just now, and now your  
> patch works fine for the case I was looking at.  I applied the  
> dirent increase patch to trunk.
>
> I now get the correct number of getdents calls (using ext3 for  
> comparison) on PVFS:
>
> getdents64(3, /* 170 entries */, 4096)  = 4080
> getdents64(3, /* 132 entries */, 4096)  = 3168
> getdents64(3, /* 0 entries */, 4096)    = 0
>
> So even with just 300 entries your patch takes us from 11 getdents  
> system calls down to 3 to do an ls.
>
> Thanks!
> -Phil
>
> Phil Carns wrote:
>> I looked at the code a little just now.  The getdents system call  
>> passes a filldir() callback function into the file system readdir()  
>> implementation that lets it fill entries until the user's dentry  
>> buffer is full.  The dentries at this level use variable length  
>> strings.  The only remaining cap at this point is the size of the  
>> dentry buffer passed in from user space (and any artificial cap  
>> introduced by the file system implementation).
>> http://lxr.linux.no/linux+v2.6.26.5/fs/readdir.c#L270
>> http://lxr.linux.no/linux+v2.6.26.5/fs/readdir.c#L232
>> If I do an strace on a directory with 300 entries on ext3, this is  
>> what happens:
>> getdents64(3, /* 170 entries */, 4096)  = 4080
>> getdents64(3, /* 132 entries */, 4096)  = 3168
>> getdents64(3, /* 0 entries */, 4096)    = 0
>> If I do the same thing on a PVFS volume, this is what happens:
>> getdents64(3, /* 34 entries */, 4096)   = 816
>> getdents64(3, /* 32 entries */, 4096)   = 768
>> getdents64(3, /* 32 entries */, 4096)   = 768
>> getdents64(3, /* 32 entries */, 4096)   = 768
>> getdents64(3, /* 32 entries */, 4096)   = 768
>> getdents64(3, /* 32 entries */, 4096)   = 768
>> getdents64(3, /* 32 entries */, 4096)   = 768
>> getdents64(3, /* 32 entries */, 4096)   = 768
>> getdents64(3, /* 32 entries */, 4096)   = 768
>> getdents64(3, /* 12 entries */, 4096)   = 288
>> getdents64(3, /* 0 entries */, 4096)    = 0
>> The PVFS case is not filling up the getdents buffer because our code  
>> is stopping at 32 entries per iteration.  If I then apply Bart's  
>> patch, more entries fit into each getdents system call, but on my  
>> box at least (2.6.24-19, 32bit, current PVFS trunk) something new  
>> breaks:
>> getdents64(3, /* 170 entries */, 4096)  = 4080
>> getdents64(3, /* 0 entries */, 4096)    = 0
>> It looks like it stopped after one getdents (the actual output from  
>> ls only shows 170 entries).
>> So... I would like to apply this patch, but first I need to dig a  
>> little more and find out what the bug is on my system that is  
>> making it stop at the first getdents call.  It must not be handling  
>> the token right in the case where PVFS returns more entries than  
>> filldir() can consume.
>> -Phil
>> Rob Ross wrote:
>>> Has the internal kernel value changed since we last looked?
>>>
>>> Rob
>>>
>>> On Sep 4, 2008, at 4:16 PM, Phil Carns wrote:
>>>
>>>> Sam Lang wrote:
>>>>> Hi Bart,
>>>>> Thanks for the patch.  For users with that many files in a  
>>>>> directory, using pvfs2-ls is probably a good alternative.
>>>>> The kernel does readdir requests 32 entries at a time, so  
>>>>> increasing MAX_NUM_DIRENTS won't help for ls.  Long listings  
>>>>> require getting the size of files, which in PVFS is fairly  
>>>>> expensive.
>>>>> Unfortunately, we haven't kept up with the readdirplus  
>>>>> implementation, so some bugs have probably crept in since Murali  
>>>>> added that tool.  If you were motivated to look at where the  
>>>>> servers were crashing, we'd certainly be interested in helping  
>>>>> with the debugging there.
>>>>> Thanks again,
>>>>> -sam
>>>>
>>>> It does look like ls improved with the patches for some reason,  
>>>> though.
>>>>
>>>> The 256 and 512 results are also just about close enough to be  
>>>> noise. It looks like most of the benefit came from the jump from  
>>>> 32/64 to 256.
>>>>
>>>> -Phil
>>>> _______________________________________________
>>>> Pvfs2-developers mailing list
>>>> Pvfs2-developers at beowulf-underground.org
>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>>

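The call counts quoted above follow from simple arithmetic, which the short Python sketch below reproduces. It is only a model, not PVFS or kernel code: the function name and its parameters are made up for illustration, the 24-byte record size is inferred from the strace output (4080 bytes / 170 entries), and the total of 302 assumes the 300 files plus "." and "..". It also ignores the 34-entry first call seen in the capped trace, so it matches the number of calls rather than every per-call count.

```python
def getdents_calls(total_entries, record_bytes=24, buffer_bytes=4096,
                   per_call_cap=None):
    """Model the entry count returned by each getdents call.

    Hypothetical helper: assumes every directory record has the same
    size and that the file system returns as many entries as allowed.
    The returned list ends with the final call that reports 0 entries.
    """
    # How many fixed-size records fit in the user-space buffer.
    fit = buffer_bytes // record_bytes
    # An artificial file system cap can lower the per-call limit further.
    step = fit if per_call_cap is None else min(fit, per_call_cap)

    counts = []
    remaining = total_entries
    while remaining > 0:
        n = min(step, remaining)
        counts.append(n)
        remaining -= n
    counts.append(0)  # terminating call that signals end of directory
    return counts


# Buffer-limited case (ext3, or PVFS with the patch applied).
print(getdents_calls(302))                         # [170, 132, 0]
# Capped at 32 entries per iteration (PVFS before the patch).
print(len(getdents_calls(302, per_call_cap=32)))   # 11
```

With the cap removed, only the 4096-byte buffer limits each call, which is where the drop from 11 calls to 3 comes from.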