[Pvfs2-users] IOR errors

Kyle Schochenmaier kschoche at scl.ameslab.gov
Thu Apr 5 20:57:42 EDT 2007


Excellent, we'd love to have more feedback on the openib port. It seems
there are only a few users, and we/I will be more than happy to assist
with any problems you have with it.
As for the disk space problem, I believe that if striping across servers
is uneven, one disk can fill faster than the others and cause problems.
I'm not sure whether that is still an issue, though; I haven't followed
those discussions very closely. In any case, it seems you've got ample
disk space to conduct your tests.
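
To put rough numbers on that, here is a minimal sketch of how much of
the 16 GiB IOR file would land on each server under even round-robin
striping. The 64 KiB strip size, the starting server (index 0), and the
reading of "about 100G" as 100 GiB are assumptions for illustration, not
values taken from the actual configuration.

```python
# Sketch: per-server share of one file under even round-robin striping.
# STRIP_SIZE and FREE_PER_NODE are assumptions; check your own setup.

GIB = 1024 ** 3
STRIP_SIZE = 64 * 1024        # assumed strip size in bytes
NUM_SERVERS = 16              # from the thread: 16 I/O servers
FILE_SIZE = 16 * GIB          # IOR aggregate file size from the run
FREE_PER_NODE = 100 * GIB     # "about 100G" per node, read as GiB


def bytes_per_server(file_size, strip_size, num_servers):
    """Bytes stored on each server when strips are dealt out round-robin,
    starting at server 0."""
    full_strips, tail = divmod(file_size, strip_size)
    rounds, extra = divmod(full_strips, num_servers)
    shares = [rounds * strip_size] * num_servers
    # Leftover whole strips go to the first `extra` servers.
    for i in range(extra):
        shares[i] += strip_size
    # A final partial strip goes to the next server in line.
    if tail:
        shares[extra] += tail
    return shares


shares = bytes_per_server(FILE_SIZE, STRIP_SIZE, NUM_SERVERS)
print("largest share (GiB):", max(shares) / GIB)
print("fraction of free space used:", max(shares) / FREE_PER_NODE)
```

Under these assumptions each server holds 1 GiB of the file, a small
fraction of the free space reported per node, so an evenly striped
16 GiB file should not come close to filling any one disk.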

Good luck,
and keep us posted :-)

Kyle

Carlson, Timothy S wrote:
> Nobody really got back to me on this, but I've switched over to the Open
> Fabrics stack due to some other issues I was facing with the Topspin
> stack. So I'll give it another go against Open Fabrics and let the list
> know how that went. If it goes bad, I'll try ipoib. 
>
> These are all Dell 1950 x86_64 Woodcrest boxes.
>
> Disk space should not have been an issue. Each node has about 100G of
> free disk. 
>
> Thanks
>
> Tim 
>
> -----Original Message-----
> From: Murali Vilayannur [mailto:murali.vilayannur at gmail.com] 
> Sent: Thursday, April 05, 2007 4:31 AM
> To: Carlson, Timothy S
> Cc: pvfs2-users at beowulf-underground.org
> Subject: Re: [Pvfs2-users] IOR errors
>
> Hi Tim,
> I don't know if anyone responded to this email or if it got lost.
> You could try a couple of things and also provide some more
> information:
> - Are these Opteron/x86_64 boxes?
> - Can you try this out on tcp instead of ib, if possible? That will help
> us rule out any IB-specific oddities.
> - Writes may have hit ENOSPC on one or more servers. Would it be
> possible to check the amount of available disk space on all the servers?
> I will try to reproduce this on a much smaller run, although I doubt
> anything would show up, since the nightlies would have caught it.
> Sorry for not being able to help better.
> Thanks,
> Murali
>
>
> On 3/21/07, Carlson, Timothy S <Timothy.Carlson at pnl.gov> wrote:
>   
>> Thanks to the folks who helped me out yesterday I got a nice little
>> 2.3T pvfs2 (2.6.2) file system. I have 16 nodes that are all acting as
>> I/O servers and clients. One of those boxes is also the metadata server.
>> Everything runs over Topspin IB, and I am using all the default settings
>> for my config file parameters.
>>
>> That being said, I wanted to test the bandwidth so I compiled the 
>> POSIX version of IOR against the Topspin mpich libraries.
>>
>> My run looks like this.
>>
>> IOR-2.9.4: MPI Coordinated Test of Parallel I/O
>>
>> Run began: Wed Mar 21 16:06:04 2007
>> Command line used: /home/tim/IOR -i 8 -b 1024m -o /mnt/pvfs2/ior/ior_16g
>> Machine: Linux compute-0-15.local
>>
>> Summary:
>>         api                = POSIX
>>         test filename      = /mnt/pvfs2/ior/ior_16g
>>         access             = single-shared-file
>>         clients            = 16 (1 per node)
>>         repetitions        = 8
>>         xfersize           = 262144 bytes
>>         blocksize          = 1 GiB
>>         aggregate filesize = 16 GiB
>>
>> access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   iter
>> ------    ---------  ---------- ---------  --------   --------   --------   ----
>> write     613.70     1048576    256.00     0.177541   26.43      7.24       0
>> read      1141.20    1048576    256.00     0.019199   14.34      0.329994   0
>> write     589.05     1048576    256.00     0.154706   27.74      7.06       1
>> read      1032.93    1048576    256.00     0.019723   15.84      0.417178   1
>> write     550.66     1048576    256.00     0.991332   29.58      8.43       2
>> read      1005.48    1048576    256.00     0.021340   16.28      0.448091   2
>> write     555.06     1048576    256.00     0.232900   29.48      8.57       3
>> read      1006.24    1048576    256.00     0.018788   16.27      0.263041   3
>> WARNING: Expected aggregate file size       = 17179869184.
>> WARNING: Stat() of aggregate file size      = 13958643712.
>> WARNING: Using actual aggregate bytes moved = 17179869184.
>> write     438.87     1048576    256.00     0.238877   37.23      15.80      4
>> ** error **
>> ERROR in aiori-POSIX.c (line 245): hit EOF prematurely.
>> ERROR: Success
>> ** exiting **
>> ** error **
>> ERROR in aiori-POSIX.c (line 245): hit EOF prematurely.
>>
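
A quick check of the sizes in those warnings may help narrow this down.
The sketch below only does arithmetic on the numbers printed above. The
interpretation that follows assumes IOR places rank r's 1 GiB block at
offset r * blocksize in the shared file, which is an assumption about
the layout, not something the logs confirm.

```python
# Sketch: compare the expected and stat()-reported sizes from the
# IOR warnings above. All numbers are copied from the run output.

GIB = 1024 ** 3

expected_size = 17179869184   # "Expected aggregate file size"
stat_size = 13958643712       # "Stat() of aggregate file size"
block_size = 1 * GIB          # blocksize = 1 GiB
clients = 16                  # clients = 16 (1 per node)

missing = expected_size - stat_size
whole_blocks, remainder = divmod(stat_size, block_size)

print("expected blocks:", expected_size // block_size)   # 16
print("missing GiB:", missing / GIB)                      # 3.0
print("blocks visible to stat():", whole_blocks)          # 13
print("leftover bytes:", remainder)                       # 0
```

The reported size is exactly 13 GiB, with no partial block. If the
layout assumption holds, that would be consistent with the last three
clients' blocks not being reflected in the file size when stat() ran,
rather than with a write that stopped partway through a block.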
>>
>> I would say that the performance is quite good until I get to those
>> errors. Nothing interesting in the client or server logs. Something in
>> my IOR setup that might be stressing things a bit too hard?
>>
>> Thanks for any insights.
>>
>> Tim
>>
>>
>> Tim Carlson
>> Voice: (509) 376 3423
>> Email: Tim.Carlson at pnl.gov
>> Pacific Northwest National Laboratory
>> HPCaNS: High Performance Computing and Networking Services
>>
>> _______________________________________________
>> Pvfs2-users mailing list
>> Pvfs2-users at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>>
