[Pvfs2-developers] pvfs2-cp performance with single client and single server

Scott Atchley atchley at myri.com
Fri Dec 29 10:43:18 EST 2006


Hi all,

What performance do you typically see with a single client and single  
server (not the same machine) with 10 Gb/s NICs?

I am using pvfs2-cp to copy a 1 GB file from the client to the  
server. The client is reading from a tmpfs mount so it does not use  
disk (I am not swapping). The server's backing store is also tmpfs. I  
set FlowBufferSizeBytes to 1 MB. With tweaking, I am seeing about 400  
MB/s.

On the same machine, if I use dd to copy from /dev/zero to
/mnt/tmpfs/zeros using 1 MB blocks, I get 300 MB/s for a 1 GB file.

Initially, I used the simplest possible BMI_meth_memalloc() and
BMI_meth_memfree(), which simply call malloc() and free(), and I was
getting about 300 MB/s. Suspecting that allocation was the problem, I
tinkered with mallopt() to raise the trim and mmap thresholds. This
added about 50 MB/s.
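
For anyone who wants to try the same mallopt() tuning, here is a minimal sketch of the kind of call involved. It assumes glibc on a 64-bit Linux host. The helper name tune_malloc_thresholds() and the threshold values in the usage note are illustrative assumptions, not the values used in the runs above.

```c
#include <assert.h>
#include <malloc.h>   /* mallopt(), M_TRIM_THRESHOLD, M_MMAP_THRESHOLD (glibc) */
#include <stdlib.h>

/*
 * Hypothetical helper (not part of PVFS2 or BMI): raise the glibc malloc
 * thresholds so that large buffers are carved from the heap instead of
 * being mmap()ed and munmap()ed on every malloc()/free() pair, and so
 * that freed heap memory is not immediately returned to the kernel.
 *
 * Returns 1 on success and 0 if either mallopt() call is rejected.
 * glibc caps M_MMAP_THRESHOLD (about 32 MB on 64-bit, 512 KB on 32-bit),
 * so very large values can fail.
 */
static int tune_malloc_thresholds(int trim_bytes, int mmap_bytes)
{
    assert(trim_bytes > 0 && mmap_bytes > 0);

    /* Requests at or above mmap_bytes still go to mmap(). */
    if (mallopt(M_MMAP_THRESHOLD, mmap_bytes) != 1)
        return 0;

    /* Keep up to trim_bytes of free memory at the top of the heap. */
    if (mallopt(M_TRIM_THRESHOLD, trim_bytes) != 1)
        return 0;

    return 1;
}
```

Calling tune_malloc_thresholds(64 * 1024 * 1024, 4 * 1024 * 1024) once at startup, before the first large allocation, would keep 1 MB buffers below the mmap threshold. The right values depend on the buffer size and the memory available.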

Next, I preallocated memory at startup and now manage a list of these
buffers. This added another 50 MB/s to get me to 400 MB/s. I tried
playing with pvfs2-cp's -b option, but performance never improved over
the default behavior. Interestingly, on the client, pvfs2-cp only uses
two 1 MB buffers (over and over) for the entire 1 GB transfer. Is this
intentional? Does this mean that only one buffer is in flight while
the other is being filled? Is there a way to get pvfs2-cp to use more
concurrent messages?
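
To make the preallocated buffer list concrete, here is a minimal sketch of a fixed-size buffer pool kept on a free list. All names (buf_pool, pool_init, pool_get, pool_put) are hypothetical and are not the BMI method's actual code. The sketch is single-threaded and would need a lock around the list if buffers can be requested from more than one thread.

```c
#include <assert.h>
#include <stdlib.h>

/* A free buffer stores the list link in its own first bytes. */
struct buf_node {
    struct buf_node *next;
};

struct buf_pool {
    struct buf_node *free_list; /* LIFO list of idle buffers */
    size_t buf_size;            /* size of every buffer in the pool */
    size_t nfree;               /* number of buffers currently idle */
};

/* Allocate `count` buffers of `buf_size` bytes up front.
 * Returns 0 on success, -1 if an allocation fails. */
static int pool_init(struct buf_pool *p, size_t buf_size, size_t count)
{
    assert(buf_size >= sizeof(struct buf_node));
    p->free_list = NULL;
    p->buf_size = buf_size;
    p->nfree = 0;
    for (size_t i = 0; i < count; i++) {
        struct buf_node *n = malloc(buf_size);
        if (n == NULL)
            return -1;
        n->next = p->free_list;
        p->free_list = n;
        p->nfree++;
    }
    return 0;
}

/* Hand out an idle buffer; fall back to malloc() if the pool is empty. */
static void *pool_get(struct buf_pool *p)
{
    struct buf_node *n = p->free_list;
    if (n == NULL)
        return malloc(p->buf_size);
    p->free_list = n->next;
    p->nfree--;
    return n;
}

/* Return a buffer to the pool instead of calling free(). */
static void pool_put(struct buf_pool *p, void *buf)
{
    struct buf_node *n = buf;
    n->next = p->free_list;
    p->free_list = n;
    p->nfree++;
}
```

In this sketch a memalloc-style hook would call pool_get() and a memfree-style hook would call pool_put(), so the steady-state transfer path never enters malloc() or free().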

With Lustre, I see ~675 MB/s with a single client using one thread to  
a single server. This is not going through the entire filesystem,  
however. It is simply testing the network layer. By default, though,  
Lustre will try to use 8 or 16 threads (depending on a configurable  
parameter).

Scott

