[Pvfs2-developers] Operation cancelled (possibly due to timeout) error

Phil Carns carns at mcs.anl.gov
Wed Oct 22 08:58:24 EDT 2008


brain wrote:
> * Phil Carns <carns at mcs.anl.gov> [2008 10 17, 10:46]:
>> I have two configuration suggestions that you can try.
> 
> Hello Phil, hello everybody,
> 
>  I followed your suggestions and verified that, after adding
> 
>  TroveMethod alt-aio
> 
>  *and* increasing ServerJobFlowTimeoutSecs to 300, I no longer get any
> errors, even under heavy load. Now I am wondering whether the performance
> I get is acceptable or whether I am just 'hiding' the underlying problem.
> 
>  So, I first tried to write to the ext3 slice directly, from every
> single host, without using pvfs2, with the following command:
> 
> dd if=/dev/zero of="bigfile" bs=4K count=2000000
> 
>  It gave me an average of 24 Mb/s with 6 hosts writing at the same time;
> with a single host writing the value is 59 Mb/s.
> 
>  The same test using pvfs2 resulted in an average of 5.7 Mb/s with 5 hosts
> writing (out of a total of 6 hosts: 1 metadata, 5 I/O), which means 6 times
> slower than direct (but parallel) I/O. Is that the "price to pay" :)
> for the distributed filesystem?
> 
>  If you think that the test I did wasn't meaningful, or if you need more
> data to evaluate it, please just ask, and I'll post all the details.

I think that you need to try a larger block size in dd.  The 4K access 
size is not going to perform well on PVFS.  Do you get better results if 
you switch to bs=16M and adjust the count accordingly?
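
As a rough sketch of what "adjust the count accordingly" could look like, the snippet below works out a count for bs=16M that writes about the same total as the original bs=4K count=2000000 run. The variable names are only illustrative, and it prints the resulting command rather than running it:

```shell
# Sizes from the original test: bs=4K, count=2000000
old_bs=4096
old_count=2000000

# Proposed block size: 16M = 16 * 1024 * 1024 bytes
new_bs=$((16 * 1024 * 1024))

# Total bytes written by the original run
total=$((old_bs * old_count))

# Number of whole 16M blocks that fit in that total (integer division rounds down)
new_count=$((total / new_bs))

# Print the equivalent command instead of executing it
echo "dd if=/dev/zero of=bigfile bs=16M count=${new_count}"
```

This gives count=488, which writes slightly less than the original run because the division rounds down. That difference should not matter for a throughput comparison.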

There is a little bit more information on this issue here:
http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00072000000000000000

It may also be interesting to try adding conv=fsync in both cases to see 
how the performance changes if you add in the cost of flushing to disk 
at the end.
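
A minimal sketch of a conv=fsync run, assuming GNU dd (which accepts conv=fsync and the 1M size suffix). The scratch path and the small size here are placeholders chosen so that it finishes quickly; for a real comparison, point the output file at the ext3 slice or the PVFS mount and scale bs and count up to the benchmark sizes:

```shell
# Hypothetical scratch file; replace with a path on the file system under test
target="${TMPDIR:-/tmp}/dd_fsync_test.$$"

# conv=fsync makes dd flush the file data and metadata to disk before it exits,
# so the reported throughput includes the cost of the final flush
dd if=/dev/zero of="$target" bs=1M count=4 conv=fsync 2>/dev/null

# Confirm how many bytes were actually written, then clean up
size=$(( $(wc -c < "$target") ))
rm -f "$target"
echo "$size"
```

Drop the 2>/dev/null redirection to see the throughput summary that dd prints on stderr.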

thanks,
-Phil

