[Pvfs2-developers] server crash on startup with millions of files
Sam Lang
slang at mcs.anl.gov
Tue Feb 20 13:37:33 EST 2007
On Feb 20, 2007, at 11:23 AM, Phil Carns wrote:
> Robert Latham wrote:
>> On Tue, Feb 20, 2007 at 07:29:16AM -0500, Phil Carns wrote:
>>> Oh, and one other detail; the memory usage of the servers looks
>>> fine during startup, so this doesn't appear to be a memory leak.
>>> There is quite a bit of CPU work, but I am guessing that is just
>>> Berkeley DB keeping busy in the iteration function.
>> How long does it take to scan 1.4 million files on startup?
>> ==rob
>
> That's an interesting issue :)
>
> A few observations:
>
> - we were looking at this on SAN; the results may be different on
> local disks
>
> - the db files are on the order of 500 MB for this particular setup
>
> - the time to scan varies depending on whether the db files are hot
> in the Linux buffer cache
>
> If we start the daemon right after killing another one that just
> did the same scan, then the process is CPU intensive, but fast
> (about 5 seconds). If we unmount/mount the SAN between the two
> runs so that the buffer cache is cleared, then it is very slow
> (about 5 minutes).
>
> An interesting trick is to use dd with a healthy buffer size to
> read the .db files and throw the output into /dev/null before
> starting the servers. This only takes a few seconds, and makes it
> so that the scan consistently finishes in just a few seconds as
> well. I think the reason is just that it forces the db data into
> the Linux buffer cache using an efficient access pattern so that
> Berkeley DB doesn't have to wait on disk latency for whatever small
> accesses it is performing.
>
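
For reference, here is a minimal sketch of the dd trick described
above, written in Python. It assumes that reading each .db file once,
sequentially and in large blocks, is enough to pull it into the Linux
buffer cache. The function name warm_cache and the 4 MiB block size
are illustrative choices, not part of PVFS2 or Berkeley DB.

```python
def warm_cache(paths, block_size=4 * 1024 * 1024):
    """Read each file front to back in large blocks and discard the data.

    This mimics `dd if=FILE of=/dev/null bs=4M`: the only goal is to make
    the kernel load the file contents into its buffer cache.
    Returns the total number of bytes read.
    """
    total = 0
    for path in paths:
        # buffering=0 gives raw, unbuffered reads, so each read() call
        # asks the kernel for up to block_size bytes at once.
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(block_size)
                if not chunk:  # empty bytes object means end of file
                    break
                total += len(chunk)
    return total
```

Calling warm_cache() on the list of .db files in the storage
directory before starting the servers would stand in for the dd
invocation. Whether it helps depends on the files fitting in free
memory.
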
> This seems to indicate that the Berkeley DB access pattern that
> PVFS2 generates for this case isn't very friendly, at least to SANs
> that aren't specifically tuned for it.
>
> The 5 minute scan time is a problem, because it makes it hard to
> tell when you will actually be able to mount the file system after
> the daemons appear to have started. We would be happy to try out
> any optimizations here :)
We might try using GET_MULTIPLE for the iterate cases. Hopefully
Berkeley DB uses large block sizes for the reads in that case.
-sam
>
> -Phil