[Pvfs2-developers] Operation cancelled (possibly due to timeout) error
brain
brain at autistici.org
Wed Oct 22 05:47:51 EDT 2008
* Phil Carns <carns at mcs.anl.gov> [2008 10 17, 10:46]:
> I have two configuration suggestions that you can try.
Hello Phil, hello everybody,
I followed your suggestions and verified that after adding
TroveMethod alt-aio
*and* increasing ServerJobFlowTimeoutSecs to 300 I no longer get any
errors, even under heavy load. Now I am wondering whether the performance
I get is acceptable or whether I am just 'hiding' the underlying problem.
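For reference, a sketch of where these two settings might sit in the
server configuration file. The section placement shown here
(ServerJobFlowTimeoutSecs under <Defaults>, TroveMethod under
<StorageHints>) is an assumption based on the usual layout of a generated
fs.conf, and all other directives are omitted, so it should be checked
against the actual file before use.

```
# Assumed placement; only the two directives discussed above are shown.
<Defaults>
    ServerJobFlowTimeoutSecs 300
</Defaults>

<Filesystem>
    <StorageHints>
        TroveMethod alt-aio
    </StorageHints>
</Filesystem>
```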
So, I first tried to write to the ext3 slice directly, from every
single host, without using pvfs2, with the following command:
dd if=/dev/zero of="bigfile" bs=4K count=2000000
It gave me an average of 24 Mb/s with 6 hosts writing at the same time;
with a single host writing the value is 59 Mb/s.
The same test using pvfs2 resulted in an average of 5.7 Mb/s with 5 hosts
writing (out of a total of 6 hosts: 1 metadata, 5 I/O), which is roughly
4 times slower than direct (but parallel) I/O. Is that the "price to
pay" :) for the distributed filesystem?
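To make the comparison easy to re-check, here is a small sketch of the
arithmetic behind the figures quoted above. The helper name `ratio` is
hypothetical, and the inputs are simply the per-host averages reported in
this message, not new measurements.

```shell
#!/bin/sh
# Bytes written per host by the dd run: bs=4K (4096 bytes) x count=2000000.
total_bytes=$((4096 * 2000000))
echo "bytes written per host: $total_bytes"

# ratio A B: print A/B rounded to one decimal place.
# awk is used because POSIX shell arithmetic is integer-only.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Per-host averages from the message, in the units reported there.
echo "direct (6 writers) vs pvfs2 (5 writers): $(ratio 24 5.7)x"
echo "direct (1 writer)  vs pvfs2 (5 writers): $(ratio 59 5.7)x"
```

With the numbers above this prints a per-host file size of 8192000000
bytes and ratios of 4.2x and 10.4x, depending on which direct-write
figure is taken as the baseline.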
If you think that the test I did wasn't meaningful, or if you need more
data to evaluate it, please just ask, and I'll post all the details.
> One final configuration option that you can try is changing
> "TroveSyncData no" to "TroveSyncData yes". I would suggest saving that
> one for last after you have resolved your timeout problem, and then try
> your benchmark with both settings.
That's just what I'll do in the next few days :)
Thank you very much!