[Pvfs2-developers] AltAio vs default on jazz

Rob Ross rross at mcs.anl.gov
Wed Nov 8 22:54:33 EST 2006


It would be worth placing an upper bound on the number of concurrent
operations to see if this has an impact on the 128 client case; it could be
that the servers just don't do that well when there are that many
concurrent operations. I note that the 32 server case is the only
alt-aio read case that *didn't* have poor performance at 128 clients, so
this seems like a reasonable approach.
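
A minimal sketch of what an upper bound on concurrent operations can look
like, assuming a thread-per-operation model. This is not PVFS2 code: the
function name run_bounded, its parameters, and the sleep that stands in for
the real I/O are all hypothetical, and where such a bound would sit in the
server is not settled in this thread.

```python
import threading
import time


def run_bounded(num_ops, max_in_flight, work_seconds=0.001):
    """Run num_ops simulated I/O operations, at most max_in_flight at once.

    Returns (completed, peak), where peak is the highest number of
    operations observed in flight at the same time.
    """
    # Hypothetical cap: each operation must hold one slot while it runs.
    slots = threading.BoundedSemaphore(max_in_flight)
    lock = threading.Lock()
    state = {"in_flight": 0, "peak": 0, "completed": 0}

    def one_op():
        with slots:  # blocks while max_in_flight operations are active
            with lock:
                state["in_flight"] += 1
                state["peak"] = max(state["peak"], state["in_flight"])
            time.sleep(work_seconds)  # stand-in for the real read or write
            with lock:
                state["in_flight"] -= 1
                state["completed"] += 1

    threads = [threading.Thread(target=one_op) for _ in range(num_ops)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["completed"], state["peak"]


if __name__ == "__main__":
    completed, peak = run_bounded(num_ops=128, max_in_flight=16)
    print(completed, peak)
```

Rerunning the 128 client case with a few different caps would show whether
the drop-off tracks the number of operations in flight.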

rob

Robert Latham wrote:
> Another round of those mpi-io-test runs on jazz has finally made it
> through the queue:
> 
> http://www.mcs.anl.gov/~robl/pvfs2/jazz/20061012/
> http://www.mcs.anl.gov/~robl/pvfs2/jazz/20061012-altio
> 
> Those directories have plots for write (including time to sync after
> write), plots for read (reading from a hot VFS buffer cache) and plots
> of the min/max/avg for each case (to see if weird dips and spikes in
> the graph are one-offs or consistent).
> 
> Some items of interest:
> 
> - Jazz is old.  Its thread library is the old linuxthreads stuff.
> 
> - The reads are coming out of VFS cache, so I hypothesized there would
>   not be much difference between Trove methods for reads.  That hypothesis
>   was incorrect.  I would have expected the different Trove methods to
>   matter more for writes, but that too was incorrect, probably because
>   MPI_File_sync calls PVFS_sys_flush from one processor.  The 128
>   server case is worth noting: see below.
> 
> - Even though the thread library isn't ideal, AltAio gives better peak
>   performance and more consistent results.
> 
> - Strangely, AltAio takes a bad turn when 128 clients read from 64 or 128
>   servers.
> 
> - Even more strangely, AltAio for 128 servers does very poorly.  This
>   could be an artifact of the way we implement MPI_File_sync: as the
>   number of clients increases, there is more data to flush to disk.  I
>   don't know why AltAio would handle this worse than default, though.
>   Certainly, there is the same amount of data when 128 clients write
>   to 64 or 32 servers.  Why is 128 so bad?
> 
> - It's hard to see from the scale, but AltAio for writes reaches the
>   plateau faster than default. 
> 
> - AltAio bandwidths are much more consistent than the default: the
>   candlesticks for the most part are much smaller in the AltAio case.
> 
> So, based on these numbers, I'd be tempted to make AltAio the default.
> I just wish I could explain why performance drops off for this
> benchmark with higher numbers of servers and clients. 
> 
> ==rob
> 

