[Pvfs2-developers] Re: noncontig-test
Kyle Schochenmaier
kschoche at mcs.anl.gov
Thu Aug 2 15:54:46 EDT 2007
Scott Atchley wrote:
> Kyle,
>
> Are you using mpich-mx or mpich or mpich2? Are you using the bmi_mx
> code in PVFS cvs? I am not sure if mpich-mx supports non-contiguous data.
I'm using mpich2. mpich2-1.0.5p4, and CVS head.
>
> If you are using bmi_mx that is in your cvs, please try using the
> files I sent today (I have not had a chance to update my PVFS2 cvs and
> create a patch). Error 22 is EINVAL in Linux and I actually used that
> in some of my older code.
I just built with your changes and the changes that follow, and still
have the error. I'll attach the logfile here; I'm not sure it makes
any more sense now than it did before :-/.
thanks,
Kyle
>
> Also, can you run with PVFS2_DEBUGMASK=all? Can you edit
> $PVFS2/src/io/bmi/bmi_mx/mx.h so that BMX_DEBUG is 1 and change:
>
> #define BMX_DB_MASK (BMX_DB_ERR|BMX_DB_WARN)
>
> to
>
> #define BMX_DB_MASK (BMX_DB_ERR|BMX_DB_WARN|BMX_DB_ALL)
>
> There will be a lot of output but it may point out the issue.
>
> Scott
>
> On Aug 2, 2007, at 2:56 PM, Kyle Schochenmaier wrote:
>
>> Sam and I looked into a problem we found with the noncontig-test that
>> I'm using as one of my benchmarks in my suite.
>>
>> Test setup:
>> pvfs2-fs: MX on 4 data servers, 5th server is the client. (CVS Head)
>>
>> If I run the test using MX, it fails, but with TCP the test
>> completes. We had originally thought that this was a problem in the
>> pint-request code (as the log will indicate), but I'm wondering now
>> why it would fail only when using a different transport. To rule out
>> the obvious problems, I've run other benchmarks using the same setup,
>> before and after this error shows up, and those all run to completion
>> just fine on both MX and TCP.
>>
>> Any ideas where to start with this?
>>
>> thanks,
>> Kyle
>>
>> __Output__
>>
>> TCP:
>>
>> kschoche at bb18:~/framework/noncontig-test/noncontig$ mpirun -np 1
>> ./noncontig -fname pvfs2://tmp/pvfs2/blah -fsize 1 -timing
>> ========= Parameter space dump =========
>> filename: pvfs2://tmp/pvfs2/blah ionodes
>> file size (MB): 1 buffer size 0
>> vector length: 10 element count: 1 vector count: 0
>> striping factor: 0 striping size: -1 collective buffer size: 0
>> loops: 1 displacement 0
>> ========= Dump done =========
>> #* no verification possible!
>>
>> # testing noncontiguous in memory, noncontiguous in file using
>> independent I/O
>> # vector count = 26214 - access count = 26214
>> write bandwidth (min/max/acc [MB/s]) : 0.331 / 0.331 / 0.331
>> read bandwidth (min/max/acc [MB/s]) : 0.370 / 0.370 / 0.370
>> file size: 1024kB size per process: 1023kB
>>
>> # testing noncontiguous in memory, contiguous in file using
>> independent I/O
>> # vector count = 26214 - access count = 26214
>> write bandwidth (min/max/acc [MB/s]) : 0.692 / 0.692 / 0.692
>> read bandwidth (min/max/acc [MB/s]) : 0.766 / 0.766 / 0.766
>> file size: 1023kB size per process: 1023kB
>>
>> # testing contiguous in memory, noncontiguous in file using
>> independent I/O
>> # vector count = 26214 - access count = 26214
>> write bandwidth (min/max/acc [MB/s]) : 0.348 / 0.348 / 0.348
>> read bandwidth (min/max/acc [MB/s]) : 0.392 / 0.392 / 0.392
>> file size: 1024kB size per process: 1023kB
>>
>>
>> MX:
>> kschoche at bb18:~/framework/noncontig-test/noncontig$ `mpirun -np 1
>> ./noncontig -fname pvfs2://tmp/pvfs2/blah -fsize 1 &> mx_output`
>>
>>
>> ========= Parameter space dump =========
>> filename: pvfs2://tmp/pvfs2/blah ionodes
>> file size (MB): 1 buffer size 0
>> vector length: 10 element count: 1 vector count: 0
>> striping factor: 0 striping size: -1 collective buffer size: 0
>> loops: 1 displacement 0
>> ========= Dump done =========
>> #* no verification possible!
>>
>> # testing noncontiguous in memory, noncontiguous in file using
>> independent I/O
>> # vector count = 26214 - access count = 26214
>> [E 13:39:06.029976] src/io/description/pint-request.c line 95: PINT_process_request: no segments or bytes requested!
>> [E 13:39:06.030497] [bt] ./noncontig [0x4cd655]
>> [E 13:39:06.030555] [bt] ./noncontig [0x4b2e01]
>> [E 13:39:06.030608] [bt] ./noncontig [0x4ae8f1]
>> [E 13:39:06.030658] [bt] ./noncontig [0x507b62]
>> [E 13:39:06.030707] [bt] ./noncontig [0x5080dd]
>> [E 13:39:06.030756] [bt] ./noncontig [0x507e2f]
>> [E 13:39:06.030806] [bt] ./noncontig [0x4a5030]
>> [E 13:39:06.030854] [bt] ./noncontig [0x4ae202]
>> [E 13:39:06.030903] [bt] ./noncontig [0x4ae2d5]
>> [E 13:39:06.030952] [bt] ./noncontig [0x479ab0]
>> [E 13:39:06.031001] [bt] ./noncontig [0x41df43]
>> [E 13:39:06.031072] PVFS_isys_io call: Invalid argument
>> [0] Error -524286 in MPI_File_write
>> Undefined dynamic error code
>> [E 13:39:06.067249] Warning: non PVFS2 error code (22):
>> [E 13:39:06.067468] Send immediately failed: Invalid argument
>> [E 13:39:06.067525] Send error: cancelling recv.
>> [E 13:39:06.067599] Warning: non PVFS2 error code (22):
>> [E 13:39:06.067651] msgpair failed, will retry: Invalid argument
>> [E 13:39:06.067706] *** msgpairarray_completion_fn: msgpair to server mx://bb15:0:3 failed: Invalid argument
>> [E 13:39:06.067755] *** Non-BMI failure.
>> [E 13:39:06.074742] Warning: non PVFS2 error code (22):
>> [E 13:39:06.074795] Send immediately failed: Invalid argument
>> [E 13:39:06.074843] Send error: cancelling recv.
>> [E 13:39:06.074900] Warning: non PVFS2 error code (22):
>> [E 13:39:06.074948] msgpair failed, will retry: Invalid argument
>> [E 13:39:06.074998] *** msgpairarray_completion_fn: msgpair to server mx://bb15:0:3 failed: Invalid argument
>> [E 13:39:06.075046] *** Non-BMI failure.
>> [E 13:39:06.075396] Warning: non PVFS2 error code (22):
>> [E 13:39:06.075447] Send immediately failed: Invalid argument
>> [E 13:39:06.075493] Send error: cancelling recv.
>> [E 13:39:06.075551] Warning: non PVFS2 error code (22):
>> [E 13:39:06.075599] msgpair failed, will retry: Invalid argument
>> [E 13:39:06.075649] *** msgpairarray_completion_fn: msgpair to server
>> mx://bb15:
>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mx.output.bz2
Type: application/x-bzip
Size: 19353 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20070802/59e65218/mx.output-0001.bin