[PVFS-users] PVFS API Performance with Multiple Processes

Robert Latham robl at mcs.anl.gov
Tue Dec 7 10:12:37 EST 2004


On Tue, Dec 07, 2004 at 06:14:25AM -0800, pvfstester at fastmail.com.au wrote:
> I am testing out an existing piece of software which was written using
> standard UNIX I/O and which does all of its reading and writing in large
> blocks (multiple MB per read/write). The application is generally I/O
> bound, and I am using a fairly large (60GB) data file to perform my
> tests. During the course of execution, the application reads the entire
> input file, performs some manipulation of the data, writes the file back
> out to several temporary files a bit at a time, reads the temporary
                                   ^^^^^^^^^^^^^
> files back in, manipulates the data some more, and writes the data out
> to the final output file.

Talk about a worst-case workload for pvfs1 -- or any other parallel or
distributed file system.

As you've seen when you wrote these temporary files to local disc,
single-byte writes are particularly bad for pvfs1.  We do not have
client-side caching, so each one-byte write requires a network round
trip to an I/O server.  NFS caches these kinds of writes on the client
side, but as a result it introduces consistency semantics that just
won't work for many parallel applications.
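
If the temporary files really are written a few bytes at a time, one
possible workaround is to coalesce those writes inside the application
before they reach the file system.  The following is only a minimal
sketch in Python, assuming plain UNIX-style I/O on a file descriptor.
The CoalescingWriter class, its 4 MB default buffer, and the demo()
helper are hypothetical names made up for illustration; none of this is
part of the PVFS API.

```python
import os
import tempfile


class CoalescingWriter:
    """Hypothetical helper: gather small writes in memory and hand them
    to the file system in large blocks."""

    def __init__(self, fd, buffer_size=4 * 1024 * 1024):
        self.fd = fd
        self.buffer_size = buffer_size  # assumed block size, tune as needed
        self.buf = bytearray()
        self.syscalls = 0  # how many os.write() calls were actually issued

    def write(self, data):
        # Append to the in-memory buffer; only touch the file system
        # once a full block has accumulated.
        self.buf += data
        if len(self.buf) >= self.buffer_size:
            self.flush()

    def flush(self):
        # Push everything buffered so far, retrying on short writes.
        data = bytes(self.buf)
        offset = 0
        while offset < len(data):
            offset += os.write(self.fd, data[offset:])
            self.syscalls += 1
        self.buf = bytearray()


def demo(total_bytes=100_000, buffer_size=65536):
    """Issue total_bytes one-byte writes through the buffer and report
    the resulting file size and the number of real write calls."""
    fd, path = tempfile.mkstemp()
    try:
        writer = CoalescingWriter(fd, buffer_size)
        for _ in range(total_bytes):
            writer.write(b"x")
        writer.flush()  # do not forget the final partial block
        size = os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)
    return size, writer.syscalls


if __name__ == "__main__":
    size, calls = demo()
    print("bytes written:", size, "write calls:", calls)
```

With a 64 KB buffer, demo() turns 100,000 one-byte writes into a
handful of os.write() calls.  Buffered data is not visible to anyone
else until flush() runs, so this only makes sense when no other process
needs to read the file while it is being written.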

But you said earlier in that paragraph that you read and write in
multi-megabyte chunks, so perhaps I'm misunderstanding "bit at a
time".

Thanks
==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B

