[Pvfs2-developers] Suggested testing for bmi_mx?

Scott Atchley atchley at myri.com
Tue Jan 30 21:42:36 EST 2007


Hi all,

MX doesn't reach 1220 MB/s until the message size is 4 MB. BMI reaches  
it as well (97.6% of the 1250 MB/s line rate). :-)

Scott

BMI pingpong:

Length		Lat (us)	BW (MB/s)
1048576		916.09		1144.62
2097152		1740.43		1204.96
4194304		3437.15		1220.29

Native MX:

    Length   Latency(us)    Bandwidth(MB/s)
   1048576     879.201       1192.647
   2097152    1729.612       1212.499
   4194304    3429.994       1222.831
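
As a quick check on the figures above, the sketch below recomputes bandwidth from message length and one-way latency, then expresses it as a percentage of the 1250 MB/s line rate quoted further down the thread. It assumes 1 MB = 10^6 bytes, which is what the table values appear to be consistent with. The helper names are illustrative only and are not part of BMI or MX.

```python
LINE_RATE_MB_S = 1250.0  # line rate quoted in the thread


def bandwidth_mb_s(length_bytes, latency_us):
    # Bytes per microsecond equals MB/s when 1 MB = 10**6 bytes (assumption).
    return length_bytes / latency_us


def pct_of_line_rate(bw_mb_s, line_rate=LINE_RATE_MB_S):
    # Fraction of the link's line rate, as a percentage.
    return 100.0 * bw_mb_s / line_rate


# (length in bytes, one-way latency in us) from the BMI pingpong table
bmi_rows = [(1048576, 916.09), (2097152, 1740.43), (4194304, 3437.15)]

for length, latency in bmi_rows:
    bw = bandwidth_mb_s(length, latency)
    print(f"{length:>8}  {bw:8.2f} MB/s  {pct_of_line_rate(bw):5.1f}% of line rate")
```

The same two helpers can be applied to the native MX rows to compare both stacks against the link.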


On Jan 30, 2007, at 4:34 PM, Scott Atchley wrote:

> Hi all,
>
> Here is BMI pingpong performance using Opteron 285s (dual core,  
> dual socket 2.4 GHz):
>
> Length		Lat (us)	BW (MB/s)
> 1		3.90		0.26
> 64		4.38		14.60
> 128		4.87		26.29
> 256		6.34		40.37
> 512		6.85		74.75
> 1024		8.22		124.52
> 2048		9.31		219.94
> 4096		11.62		352.49
> 8192		17.04		480.67
> 32768		39.90		821.21
> 1048576		1044.83		1003.59
>
> With MX registration cache:
> 1048576		917.61		1142.72
>
>
> and native MX performance for comparison:
>
>    Length   Latency(us)    Bandwidth(MB/s)
>         1       2.200          0.455
>        64       2.763         23.167
>       128       2.908         44.024
>       256       4.439         57.677
>       512       5.129         99.825
>      1024       6.530        156.815
>      2048       7.556        271.043
>      4096      10.145        403.766
>      8192      14.743        555.635
>     32768      37.490        874.046
>   1048576     879.213       1192.630
>
> On these machines, BMI only adds about 1.7 us latency.
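
The overhead figure can be reproduced from the two tables above. The sketch below subtracts the native MX latency from the BMI latency for each message size up to 32768 bytes and reports the median. Leaving out the 1 MB row is a choice made here, since that row also depends on the registration cache; the variable names are illustrative.

```python
from statistics import median

# (length in bytes, BMI latency us, native MX latency us), Opteron 285 tables
rows = [
    (1, 3.90, 2.200),
    (64, 4.38, 2.763),
    (128, 4.87, 2.908),
    (256, 6.34, 4.439),
    (512, 6.85, 5.129),
    (1024, 8.22, 6.530),
    (2048, 9.31, 7.556),
    (4096, 11.62, 10.145),
    (8192, 17.04, 14.743),
    (32768, 39.90, 37.490),
]

# Per-size latency added by BMI on top of native MX.
overheads = [bmi_us - mx_us for _, bmi_us, mx_us in rows]

for (length, _, _), extra in zip(rows, overheads):
    print(f"{length:>6}  +{extra:.2f} us")

print(f"median overhead: {median(overheads):.2f} us")
```

The median is used rather than the mean so that the two larger gaps at 8192 and 32768 bytes do not dominate the summary.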
>
> I would normally expect MX to get about 1225 MB/s (out of the 1250  
> MB/s line rate) and I would expect BMI to get about 1200 MB/s. I  
> will look into this tomorrow.
>
> Overall, raw BMI performance is good and imposes little overhead.
>
> Scott
>
>
> On Jan 26, 2007, at 4:35 PM, Scott Atchley wrote:
>
>> Hi Murali,
>>
>> Ok, I will check them out.
>>
>> In the meantime, I have written a test similar to IMB PingPong  
>> that uses BMI directly. It should work with TCP, GM, MX, and IB.  
>> Below are results from some old Xeons with  
>> Myrinet-2000 cards (250 MB/s link rate).
>>
>> The latency is one-way and throughput is bi-directional.
>>
>> I will also write a version that tests unexpected messages up to  
>> the maximum unexpected message size.
>>
>> Scott
>>
>> bmi_mx results:
>>
>> Length		Lat (us)	BW (MB/s)
>> 1		7.97		0.13
>> 64		8.85		7.23
>> 256		11.99		21.35
>> 512		13.76		37.20
>> 1024		17.76		57.67
>> 4096		32.49		126.07
>> 8192		54.38		150.65
>> 32768		158.41		206.85
>> 1048576		4583.91		228.75
>>
>> With the registration cache:
>> 1048576		4305.72		243.53
>>
>>
>> For comparison, these are mx_pingpong (raw MX) results for the  
>> same message sizes:
>>
>>    Length   Latency(us)    Bandwidth(MB/s)
>>         1       3.466          0.288
>>        64       4.587         13.954
>>       256       7.141         35.852
>>       512       9.089         56.329
>>      1024      12.937         79.153
>>      4096      26.696        153.431
>>      8192      46.815        174.987
>>     32768     154.087        212.659
>>   1048576    4271.931        245.457
>>
>> BMI and/or bmi_mx adds about 4.5 us of latency. It can get  
>> close to line rate with or without the MX registration cache.
>>
>> Scott
>> _______________________________________________
>> Pvfs2-developers mailing list
>> Pvfs2-developers at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers