[Pvfs2-developers] Listing performance patch
Phil Carns
carns at mcs.anl.gov
Mon Nov 3 13:12:57 EST 2008
Hi David,
Thanks for the bug report. I don't know offhand how the dirent count
could cause the ls utility to consume so much memory, but I'll see if I
can reproduce it here.
thanks,
-Phil
David Metheny wrote:
> It appears that increasing "MAX_DIRENT_COUNT" in the
> src/kernel/linux2.6/pvfs2-dev-proto.h file has turned out to be a bad thing
> for us. We had also set this to 96, and found some issues during
> stress testing.
>
> We've hit a scenario where a single directory on our file system contained
> more than 800,000 files/directories, with many directories containing
> 10,000+ files each. When we executed 'ls -Rl' on the top-level directory,
> after about 8 hours the 'ls' command was consuming 800MB+ of memory and
> eventually exited with a "memory exhausted" error. We definitely have some
> paths that are long enough that 96 of them won't fit into a single 4K page.
>
> We backed out only the "MAX_DIRENT_COUNT" change in
> src/kernel/linux2.6/pvfs2-dev-proto.h, putting it back at 0x00000020 (32), and
> reran the test. The 'ls -Rl' now consistently runs in about an hour and
> finishes correctly.
>
>
> -----Original Message-----
> From: pvfs2-developers-bounces at beowulf-underground.org
> [mailto:pvfs2-developers-bounces at beowulf-underground.org] On Behalf Of Phil
> Carns
> Sent: Thursday, September 11, 2008 9:33 AM
> To: Bart Taylor
> Cc: pvfs2-developers at beowulf-underground.org
> Subject: Re: [Pvfs2-developers] Listing performance patch
>
> Hi Bart,
>
> I fixed a silly bug in our readdir logic just now, and now your patch
> works fine for the case I was looking at. I applied the dirent increase
> patch to trunk.
>
> I now get the correct number of getdents calls (using ext3 for
> comparison) on PVFS:
>
> getdents64(3, /* 170 entries */, 4096) = 4080
> getdents64(3, /* 132 entries */, 4096) = 3168
> getdents64(3, /* 0 entries */, 4096) = 0
>
> So even with just 300 entries your patch takes us from 11 getdents
> system calls down to 3 to do an ls.
>
> Thanks!
> -Phil
>
> Phil Carns wrote:
>> I looked at the code a little just now. The getdents system call passes
>> a filldir() callback function into the file system readdir()
>> implementation that lets it fill entries until the user's dentry buffer
>> is full. The dentries at this level use variable length strings. The
>> only remaining cap at this point is the size of the dentry buffer passed
>> in from user space (and any artificial cap introduced by the file system
>> implementation).
>>
>> http://lxr.linux.no/linux+v2.6.26.5/fs/readdir.c#L270
>> http://lxr.linux.no/linux+v2.6.26.5/fs/readdir.c#L232
>>
>> If I do an strace on a directory with 300 entries on ext3, this is what
>> happens:
>>
>> getdents64(3, /* 170 entries */, 4096) = 4080
>> getdents64(3, /* 132 entries */, 4096) = 3168
>> getdents64(3, /* 0 entries */, 4096) = 0
>>
>> If I do the same thing on a PVFS volume, this is what happens:
>>
>> getdents64(3, /* 34 entries */, 4096) = 816
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 32 entries */, 4096) = 768
>> getdents64(3, /* 12 entries */, 4096) = 288
>> getdents64(3, /* 0 entries */, 4096) = 0
>>
>> The latter is not filling up the getdents buffer because our code is
>> stopping at 32 entries per iteration. If I then apply Bart's patch,
>> things improve in terms of how much fits into one getdents system
>> call, but on my box at least (2.6.24-19, 32bit, current PVFS trunk)
>> something new breaks:
>>
>> getdents64(3, /* 170 entries */, 4096) = 4080
>> getdents64(3, /* 0 entries */, 4096) = 0
>>
>> It looks like it stopped after one getdents (the actual output from ls
>> only shows 170 entries).
>>
>> So... I would like to apply this patch, but first I need to dig a little
>> more and find out what the bug is on my system that is making it stop at
>> the first getdents call. It must not be handling the token right in the
>> case where PVFS returns more entries than filldir() can consume.
>>
>> -Phil
>>
>>
>> Rob Ross wrote:
>>> Has the internal kernel value changed since we last looked?
>>>
>>> Rob
>>>
>>> On Sep 4, 2008, at 4:16 PM, Phil Carns wrote:
>>>
>>>> Sam Lang wrote:
>>>>> Hi Bart,
>>>>> Thanks for the patch. For users with that many files in a
>>>>> directory, using pvfs2-ls is probably a good alternative.
>>>>> The kernel does readdir requests 32 entries at a time, so increasing
>>>>> MAX_NUM_DIRENTS won't help for ls. Long listings require getting
>>>>> the size of files, which in PVFS is fairly expensive.
>>>>> Unfortunately, we haven't kept up with the readdirplus
>>>>> implementation, and some bugs have probably crept in since Murali added
>>>>> that tool. If you were motivated to look at where the servers were
>>>>> crashing, we'd certainly be interested in helping with the debugging
>>>>> there.
>>>>> Thanks again,
>>>>> -sam
>>>> It does look like ls improved with the patches for some reason, though.
>>>>
>>>> The 256 and 512 results are also just about close enough to be noise.
>>>> It looks like most of the benefit came from the jump from 32/64 to 256.
>>>>
>>>> -Phil
>>>> _______________________________________________
>>>> Pvfs2-developers mailing list
>>>> Pvfs2-developers at beowulf-underground.org
>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>
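For anyone following the numbers in the thread, here is a rough sketch that reproduces the getdents64 call counts quoted above. It is a simplified model and not PVFS or kernel code: the function names (dirent64_reclen, getdents_calls) and the uncapped_first parameter are made up for illustration. It assumes the linux_dirent64 record layout (a 19-byte header, the name, a NUL terminator, rounded up to 8 bytes), and it assumes the extra 2 entries in the first PVFS call are "." and ".." being added outside the 32-entry cap. It also assumes the "4K page" in David's report can be treated like the 4096-byte getdents buffer, which may not be the buffer he meant.

```python
def dirent64_reclen(name_len: int) -> int:
    """Size in bytes of one linux_dirent64 record for a name of name_len bytes.

    Header is d_ino (8) + d_off (8) + d_reclen (2) + d_type (1) = 19 bytes,
    followed by the name and a NUL, rounded up to a multiple of 8.
    """
    header = 8 + 8 + 2 + 1
    return (header + name_len + 1 + 7) // 8 * 8


def getdents_calls(total_entries, reclen, bufsize=4096, fs_cap=None,
                   uncapped_first=0):
    """Return the number of entries delivered by each simulated getdents64 call.

    fs_cap is a hypothetical per-call limit imposed by the file system
    (None means only the user buffer limits a call). uncapped_first is the
    number of extra entries allowed on the first call only (an assumption
    used to model "." and ".."). The final 0 is the call that signals
    end of directory.
    """
    per_buffer = bufsize // reclen
    if per_buffer == 0:
        raise ValueError("buffer too small for a single record")
    calls = []
    remaining = total_entries
    first = True
    while remaining > 0:
        limit = per_buffer
        if fs_cap is not None:
            limit = min(limit, fs_cap + (uncapped_first if first else 0))
        n = min(limit, remaining)
        calls.append(n)
        remaining -= n
        first = False
    calls.append(0)
    return calls


if __name__ == "__main__":
    reclen = dirent64_reclen(4)  # names of up to 4 bytes give 24-byte records
    total = 300 + 2              # 300 files plus "." and ".."
    print("buffer-limited:", getdents_calls(total, reclen))
    print("capped at 32:  ", getdents_calls(total, reclen, fs_cap=32,
                                            uncapped_first=2))
    # Longest name for which 96 records still fit in one 4096-byte buffer.
    print("reclen for 20-byte names:", dirent64_reclen(20))
    print("reclen for 21-byte names:", dirent64_reclen(21))
```

Under these assumptions the buffer-limited case gives 3 calls and the capped case gives 11, matching the traces in the thread. With this record layout, 96 entries stop fitting in 4096 bytes once names exceed 20 bytes.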