[PVFS-users] compiling libpvfs.so.1.4

Phil Carns pcarns@hubcap.clemson.edu
Wed, 2 Aug 2000 14:40:17 -0400 (EDT)


> Phil,
> 	You mention a performance penalty in using the kernel interface here. 
> Do you mind elaborating a bit on this?  What does the roadmap of
> development hold in terms of performance?  Thanks.

This is a rather long-winded description, but it will probably be good to
archive these points on the mailing list :)

The kernel interface inherently causes a slight performance penalty
for two main reasons.  First, it requires an extra memory buffer transfer
to push data between the application, kernel space, and pvfsd (client
daemon) before the data actually hits the network.  In contrast, if the
library handles the transfer, the data is moved directly in and out of
application buffers.  Some of this overhead can be cut down by using the
kernel patch for raw I/O based memory region transfers, but not all of it.

Secondly, there is a slight penalty just for the context switch of
handling file system operations in the kernel.  The impact of this is
probably not as big as item 1, but I think it is an issue.  Again, if you
are using the library there is no context switch involved; the operations
are carried out directly in the application by the library code.

The performance tradeoff is mostly noticeable when doing small file
operations.  I guess it boils down to more of a latency issue.  On small
operations the fixed cost of the memory copy and context switch is large
relative to the amount of work being done, and the result is lower
performance on such operations.  On larger file operations, this is not
nearly as evident because the overhead is amortized over a larger
transfer.
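
To make the amortization argument concrete, here is a minimal sketch of a
fixed-overhead cost model.  It is only an illustration: the model (one fixed
per-operation cost plus a transfer time proportional to size) and the
function names are assumptions, and the overhead and bandwidth figures in
the usage note are made-up placeholders, not PVFS measurements.

```python
def effective_throughput(size_bytes, overhead_s, bandwidth_bps):
    """Bytes per second achieved when every operation pays a fixed overhead.

    overhead_s stands in for the extra copy plus context switch (assumed
    constant per operation); bandwidth_bps is the raw transfer rate.
    """
    return size_bytes / (overhead_s + size_bytes / bandwidth_bps)


def efficiency(size_bytes, overhead_s, bandwidth_bps):
    """Fraction of the raw bandwidth that survives the fixed overhead."""
    return effective_throughput(size_bytes, overhead_s, bandwidth_bps) / bandwidth_bps


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the trend.
    OVERHEAD_S = 50e-6      # assumed per-operation cost
    BANDWIDTH_BPS = 10e6    # assumed raw transfer rate
    for size in (1024, 64 * 1024, 1024 * 1024):
        print(size, round(efficiency(size, OVERHEAD_S, BANDWIDTH_BPS), 4))
```

Under this model the efficiency rises toward 1.0 as the operation size
grows, which is the same trend described above: the fixed cost matters for
small operations and fades for large ones.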

On the other hand, the kernel interface is a much more reasonable way to
provide unix file system compatibility.  It allows us to provide all of
the functionality expected of a compatible file system, such as the
ability to run executables, shell redirection, and better thread
safety.  It is also easier to maintain because it is based on a
well-defined interface, rather than something that requires quite a bit of
effort to update with each new glibc release.

In either case, if you are after absolute raw performance rather than
compatibility, the native pvfs interface library and the ROMIO MPI-IO
interface are always going to be the preferred path to parallel I/O
performance.  They can express parallel file access much better because
they are not bound to the compatibility requirements of the traditional
unix file access interface.  The unix file interface just wasn't meant for
parallel access, unfortunately.
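
As an illustration of that limitation, here is a minimal sketch of a strided
read done through the plain unix interface.  The access pattern, the helper
names, and the sizes are all assumptions picked for the example; nothing
here is PVFS code.  The point is only that the unix calls take one offset
and one length at a time, so a noncontiguous pattern turns into one call
per block, whereas an interface such as MPI-IO can describe the whole
pattern in a single request.

```python
import os
import tempfile


def strided_read(fd, start, block, stride, count):
    """Read `count` blocks of `block` bytes spaced `stride` bytes apart.

    The unix interface has no way to describe the stride, so this issues
    one pread() per block and counts the calls it makes.
    """
    pieces = []
    calls = 0
    for i in range(count):
        pieces.append(os.pread(fd, block, start + i * stride))
        calls += 1
    return b"".join(pieces), calls


def demo():
    """Build a 4096-byte scratch file and read 4 bytes out of every 256."""
    with tempfile.TemporaryFile() as f:
        f.write(bytes(range(256)) * 16)
        f.flush()
        return strided_read(f.fileno(), start=0, block=4, stride=256, count=16)


if __name__ == "__main__":
    data, calls = demo()
    print(len(data), "bytes gathered with", calls, "separate read calls")
```

Each of those calls pays the per-operation overhead discussed earlier, which
is why small noncontiguous accesses are where the unix interface suffers
most.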

As far as the future of pvfs-kernel performance goes, things are going to
get better over time.  For one thing, the raw I/O transfer method will
become the default transfer method under the 2.4 kernel, because that
functionality has been integrated into the new kernel and will no longer
require a patch.  There are other less obvious optimizations that can be
made as well when development time allows (some of these are listed in the
README, others have become apparent as more users run PVFS and report
their findings to us).

-Phil