[Pvfs2-users] romio problems
Jan Lindheim
lindheim at cacr.caltech.edu
Thu Mar 22 11:57:22 EST 2007
>> We have found that when trying to use pvfs with romio under openmpi,
>> we are getting errors when the task count is bigger than 128, using
>> 1MB messages. Smaller message sizes and larger task counts also cause
>> the same error to be generated, just not as consistently or quickly.
>> Errors that we see look like:
>>
>> [E 15:05:50.012128] job_time_mgr_expire: job time out: cancelling bmi operation, job_id: 34.
>> [E 15:05:50.012380] msgpair failed, will retry: Operation cancelled (possibly due to timeout)
>Just want to understand your workload a bit:
>You are doing a collective write with 128 processes each writing 1MB,
>right?
The code is not using collective writes.
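To make the workload concrete, here is a rough back-of-the-envelope sketch of one round of independent writes. The task count and message size are the ones quoted above, and the server count is the figure given later in this message. The 64 KiB strip size and the round-robin layout are assumptions (not confirmed for this file system), and the function name is purely illustrative.

```python
MIB = 1024 * 1024


def workload_summary(tasks, bytes_per_task, io_servers, strip_size=64 * 1024):
    """Estimate how one round of independent writes spreads over I/O servers.

    Assumes round-robin striping with a fixed strip size; the 64 KiB
    default used here is an assumption, not a value taken from this thread.
    """
    total_bytes = tasks * bytes_per_task
    # Ceiling division: number of strips one task's write is split into.
    strips_per_task = -(-bytes_per_task // strip_size)
    # A task cannot touch more servers than it has strips.
    servers_per_task = min(io_servers, strips_per_task)
    return {
        "total_mib": total_bytes / MIB,
        "mib_per_server": total_bytes / io_servers / MIB,
        "strips_per_task": strips_per_task,
        "servers_touched_per_task": servers_per_task,
    }


summary = workload_summary(tasks=128, bytes_per_task=MIB, io_servers=8)
print(summary)
```

Under these assumptions every task's 1MB write touches all 8 I/O servers, so each server may see requests from all 128 tasks at once, with no aggregation by the MPI-IO layer since the writes are not collective.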
>> Writing to an NFS mounted file system instead of PVFS, works fine even
>> with 256 tasks.
>> Our version of PVFS is 2.6.2. Both openmpi 1.1.x and 1.2 produce the
>> same errors. Any known limitations with romio and PVFS?
>> We can supply you with a test code if you are interested in reproducing
>> the problem. The code should compile well with mpich as well as
>> openmpi.
>Go ahead and send the test code, but it really looks like you are
>pushing the servers hard and hitting a timeout. How many servers do
>you have for this many clients? PVFS should be smarter about such a
>situation, but could you check something for us? In your fs.conf,
>what is the value of ServerJobBMITimeoutSecs ?
>http://www.pvfs.org/pvfs2-options.html#ServerJobBMITimeoutSecs
>If you increase that value to, say, 3600, we can ensure the timeouts
>won't get triggered.
>I have a few other ideas, but let's try this one first.
>==rob
>--
>Rob Latham
>Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
>Argonne National Lab, IL USA B29D F333 664A 4280 315B
For this PVFS file system, we are using 8 I/O servers and one metadata
server. I have adjusted the value of ServerJobBMITimeoutSecs on all
the servers involved. They had the default value of 30. I will try
to schedule an outage later today to restart the pvfs2-server
processes. I will let you know how the next test goes after that.
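For reference, the change amounts to something like the fragment below in fs.conf on each server. This is only a sketch: placing the option in the <Defaults> section is an assumption based on the layout of files generated by pvfs2-genconfig, and the other lines of that section are left out here.

```
<Defaults>
	# Was 30 (the default); 3600 is the value suggested earlier in this thread.
	ServerJobBMITimeoutSecs 3600
</Defaults>
```

The new value only takes effect once the pvfs2-server processes have been restarted.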
Attached is the test code. The tarball contains two subdirectories,
utilities and mpi_io_test. You need to cd into mpi_io_test/src. There
you'll find a README file, which describes the problems we see on our
cluster, specifics about our software environment, how to build the
code, and how to run it.
Jan Lindheim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpi-io-test.tar
Type: application/x-tar
Size: 972800 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-users/attachments/20070322/075d4597/mpi-io-test-0001.tar