[Pvfs2-developers] Listing performance patch
Rob Ross
rross at mcs.anl.gov
Fri Sep 12 05:47:07 EDT 2008
Excellent. Thanks Bart, all! -- Rob
On Sep 11, 2008, at 9:32 AM, Phil Carns wrote:
> Hi Bart,
>
> I fixed a silly bug in our readdir logic just now, and now your
> patch works fine for the case I was looking at. I applied the
> dirent increase patch to trunk.
>
> I now get the correct number of getdents calls (using ext3 for
> comparison) on PVFS:
>
> getdents64(3, /* 170 entries */, 4096) = 4080
> getdents64(3, /* 132 entries */, 4096) = 3168
> getdents64(3, /* 0 entries */, 4096) = 0
>
> So even with just 300 entries your patch takes us from 11 getdents
> system calls down to 3 to do an ls.
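
The 11-versus-3 count can be reproduced with a back-of-the-envelope model. This is only a sketch: the 24-byte record size is read off the traces (4080 / 170), the total of 302 entries is the sum of the per-call counts, and the function name and parameters are made up for illustration.

```python
def getdents_calls(num_entries, buf_size=4096, rec_size=24, per_call_cap=None):
    """Return the number of entries each simulated getdents call yields.

    rec_size=24 is taken from the traces in this thread; per_call_cap
    models an artificial limit imposed by the file system, if any.
    """
    fit = buf_size // rec_size          # entries that fit in the user buffer
    if per_call_cap is not None:
        fit = min(fit, per_call_cap)
    calls = []
    remaining = num_entries
    while remaining > 0:
        n = min(fit, remaining)
        calls.append(n)
        remaining -= n
    calls.append(0)                     # final call returns 0: end of directory
    return calls


uncapped = getdents_calls(302)
capped = getdents_calls(302, per_call_cap=32)
print(len(uncapped), uncapped)
print(len(capped), capped)
```

With no cap the model gives three calls (170, 132, 0), matching the ext3 trace. With a cap of 32 it gives eleven calls, the same total as the old PVFS trace, although the real trace split the entries slightly differently (34 in the first call, 12 in the last).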
>
> Thanks!
> -Phil
>
> Phil Carns wrote:
>> I looked at the code a little just now. The getdents system call
>> passes a filldir() callback function into the file system readdir()
>> implementation that lets it fill entries until the user's dentry
>> buffer is full. The dentries at this level use variable length
>> strings. The only remaining cap at this point is the size of the
>> dentry buffer passed in from user space (and any artificial cap
>> introduced by the file system implementation).
>> http://lxr.linux.no/linux+v2.6.26.5/fs/readdir.c#L270
>> http://lxr.linux.no/linux+v2.6.26.5/fs/readdir.c#L232
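
A rough user-space model of the callback mechanism described above, not the kernel code itself. It assumes the linux_dirent64 layout (19 bytes before d_name, each record padded to a multiple of 8 bytes); DirentBuffer, readdir and the file names are hypothetical and exist only for this sketch.

```python
HEADER = 19   # assumed: d_ino (8) + d_off (8) + d_reclen (2) + d_type (1)
ALIGN = 8     # assumed: records are padded to 8-byte boundaries


def reclen(name):
    """Size of one variable-length record: header + name + NUL, padded."""
    raw = HEADER + len(name.encode()) + 1
    return (raw + ALIGN - 1) // ALIGN * ALIGN


class DirentBuffer:
    """Stands in for the user's dentry buffer passed to getdents."""

    def __init__(self, size):
        self.size = size
        self.used = 0
        self.names = []

    def filldir(self, name):
        """Accept the entry if it still fits; otherwise refuse it."""
        need = reclen(name)
        if self.used + need > self.size:
            return False
        self.used += need
        self.names.append(name)
        return True


def readdir(entries, pos, buf):
    """Hypothetical file system readdir: offer entries until filldir refuses."""
    while pos < len(entries) and buf.filldir(entries[pos]):
        pos += 1
    return pos   # where the next getdents call should resume


entries = [".", ".."] + ["f%03d" % i for i in range(300)]
buf = DirentBuffer(4096)
pos = readdir(entries, 0, buf)
print(pos, buf.used)
```

With names of four characters or fewer every record is 24 bytes, so the first call stops after 170 entries and 4080 bytes, which is what the ext3 trace shows.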
>> If I do an strace on a directory with 300 entries on ext3, this is
>> what happens:
>> getdents64(3, /* 170 entries */, 4096) = 4080
>> getdents64(3, /* 132 entries */, 4096) = 3168
>> getdents64(3, /* 0 entries */, 4096) = 0
>> If I do the same thing on a PVFS volume, this is what happens:
>> getdents64(3, /* 34 entries */, 4096) = 816
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 12 entries */, 4096) = 288
>> getdents64(3, /* 0 entries */, 4096) = 0
>> The latter is not filling up the getdents buffer because our code
>> is stopping at 32 entries per iteration. If I then apply Bart's
>> patch, things improve in terms of how many entries fit into one
>> getdents system call, but on my box at least (2.6.24-19, 32bit,
>> current PVFS trunk) something new breaks:
>> getdents64(3, /* 170 entries */, 4096) = 4080
>> getdents64(3, /* 0 entries */, 4096) = 0
>> It looks like it stopped after one getdents (the actual output from
>> ls only shows 170 entries).
>> So... I would like to apply this patch, but first I need to dig a
>> little more and find out what the bug is on my system that is
>> making it stop at the first getdents call. It must not be handling
>> the token right in the case where PVFS returns more entries than
>> filldir() can consume.
>> -Phil
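
One way a resume token could go wrong in the case Phil describes, sketched as a toy model. This is a hypothetical failure mode chosen because it reproduces the "170, then 0" trace; it is not taken from the PVFS source, and list_dir, batch, capacity and resume_at_consumed are invented names.

```python
def list_dir(num_entries, batch, capacity, resume_at_consumed=True):
    """Return entries delivered per simulated getdents call.

    batch    : entries the server returns per request (hypothetical)
    capacity : entries filldir accepts before the user buffer is full
    resume_at_consumed : if True, the next request starts at the first
        entry filldir did not take; if False (the assumed bug), it
        starts after the whole batch, skipping whatever was left over.
    """
    pos = 0
    calls = []
    while True:
        fetched = min(batch, num_entries - pos)   # what the server hands back
        taken = min(fetched, capacity)            # what filldir accepts
        calls.append(taken)
        if taken == 0:
            return calls
        pos += taken if resume_at_consumed else fetched


print(list_dir(302, batch=512, capacity=170))
print(list_dir(302, batch=512, capacity=170, resume_at_consumed=False))
print(list_dir(302, batch=32, capacity=170, resume_at_consumed=False))
```

In this model the mistake is invisible as long as the batch is no larger than what filldir can take (the 32-entry case), and only shows up once a larger batch overflows the buffer, which would be consistent with the problem appearing only after the dirent increase.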
>> Rob Ross wrote:
>>> Has the internal kernel value changed since we last looked?
>>>
>>> Rob
>>>
>>> On Sep 4, 2008, at 4:16 PM, Phil Carns wrote:
>>>
>>>> Sam Lang wrote:
>>>>> Hi Bart,
>>>>> Thanks for the patch. For users with that many files in a
>>>>> directory, using pvfs2-ls is probably a good alternative.
>>>>> The kernel does readdir requests 32 entries at a time, so
>>>>> increasing MAX_NUM_DIRENTS won't help for ls. Long listings
>>>>> require getting the size of files, which in PVFS is fairly
>>>>> expensive.
>>>>> Unfortunately, we haven't kept up with the readdirplus
>>>>> implementation, and some bugs have probably crept in since
>>>>> Murali added that tool. If you were motivated to look at where the
>>>>> servers were crashing, we'd certainly be interested in helping
>>>>> with the debugging there.
>>>>> Thanks again,
>>>>> -sam
>>>>
>>>> It does look like ls improved with the patches for some reason,
>>>> though.
>>>>
>>>> The 256 and 512 results are also just about close enough to be
>>>> noise. It looks like most of the benefit came from the jump from
>>>> 32/64 to 256.
>>>>
>>>> -Phil
>>>> _______________________________________________
>>>> Pvfs2-developers mailing list
>>>> Pvfs2-developers at beowulf-underground.org
>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>>