[Pvfs2-developers] Re: noncontig-test
Kyle Schochenmaier
kschoche at mcs.anl.gov
Thu Aug 2 16:53:04 EDT 2007
Scott Atchley wrote:
> Kyle,
>
> I still do not see any bmi_mx messages. There should be hundreds. ;-)
Yeah, I'm confused now too. I did a `make clean && make && make install`,
so it should be in there. Oh, maybe I need to recompile the app; yeah, that
helps.
New output file again, sorry about this, everyone.
>
> Is the app statically linked? If so, did you recompile it?
>
> Scott
>
> On Aug 2, 2007, at 4:22 PM, Kyle Schochenmaier wrote:
>
>> Scott Atchley wrote:
>>> Kyle,
>>>
>>> Thanks.
>>>
>>> I do not see any bmi_mx error messages or any bmi_mx messages at
>>> all. Did you change BMX_DEBUG to 1 and add BMX_DB_ALL to BMX_DB_MASK
>>> and then make and make install?
>>
>> I changed them but didn't do a make clean. I redid that and got some
>> other output for you.
>> Attached this time is the correct output, heh.
>>
>> Kyle
>>>
>>> Scott
>>>
>>> On Aug 2, 2007, at 3:54 PM, Kyle Schochenmaier wrote:
>>>
>>>> Scott Atchley wrote:
>>>>> Kyle,
>>>>>
>>>>> Are you using mpich-mx or mpich or mpich2? Are you using the
>>>>> bmi_mx code in PVFS cvs? I am not sure if mpich-mx supports
>>>>> non-contiguous data.
>>>> I'm using mpich2 (mpich2-1.0.5p4) and CVS head.
>>>>>
>>>>> If you are using bmi_mx that is in your cvs, please try using the
>>>>> files I sent today (I have not had a chance to update my PVFS2 cvs
>>>>> and create a patch). Error 22 is EINVAL in Linux and I actually
>>>>> used that in some of my older code.
>>>> I just built with your changes and the changes that follow, and
>>>> still have the error. I'll attach the logfile here; I'm not sure
>>>> if it makes any more sense now than it did before :-/.
>>>>
>>>>
>>>> thanks,
>>>>
>>>> Kyle
>>>>>
>>>>> Also, can you run with PVFS2_DEBUGMASK=all? Can you edit
>>>>> $PVFS2/src/io/bmi/bmi_mx/mx.h so that BMX_DEBUG is 1 and change:
>>>>>
>>>>> #define BMX_DB_MASK (BMX_DB_ERR|BMX_DB_WARN)
>>>>>
>>>>> to
>>>>>
>>>>> #define BMX_DB_MASK (BMX_DB_ERR|BMX_DB_WARN|BMX_DB_ALL)
>>>>>
>>>>> There will be a lot of output but it may point out the issue.
>>>>>
>>>>> Scott
>>>>>
>>>>> On Aug 2, 2007, at 2:56 PM, Kyle Schochenmaier wrote:
>>>>>
>>>>>> Sam and I looked into a problem we found with the noncontig-test
>>>>>> that I'm using as one of my benchmarks in my suite.
>>>>>>
>>>>>> Test setup:
>>>>>> pvfs2-fs: MX on 4 data servers, 5th server is the client. (CVS Head)
>>>>>>
>>>>>> If I run the test using MX, it fails, but with TCP the test
>>>>>> completes. We had originally thought that this was a problem in
>>>>>> the pint-request code (as the log will indicate), but now I'm
>>>>>> wondering why it would fail only with a different transport. To
>>>>>> rule out the obvious problems, I've run other benchmarks using
>>>>>> the same setup, before and after this error shows up, and those
>>>>>> all run to completion just fine on both MX and TCP.
>>>>>>
>>>>>> Any ideas where to start with this?
>>>>>>
>>>>>> thanks,
>>>>>> Kyle
>>>>>>
>>>>>> __Output__
>>>>>>
>>>>>> TCP:
>>>>>>
>>>>>> kschoche@bb18:~/framework/noncontig-test/noncontig$ mpirun -np 1
>>>>>> ./noncontig -fname pvfs2://tmp/pvfs2/blah -fsize 1 -timing
>>>>>> ========= Parameter space dump =========
>>>>>> filename: pvfs2://tmp/pvfs2/blah ionodes
>>>>>> file size (MB): 1 buffer size 0
>>>>>> vector length: 10 element count: 1 vector count: 0
>>>>>> striping factor: 0 striping size: -1 collective buffer size: 0
>>>>>> loops: 1 displacement 0
>>>>>> ========= Dump done =========
>>>>>> #* no verification possible!
>>>>>>
>>>>>> # testing noncontiguous in memory, noncontiguous in file using
>>>>>> independent I/O
>>>>>> # vector count = 26214 - access count = 26214
>>>>>> write bandwidth (min/max/acc [MB/s]) : 0.331 / 0.331 / 0.331
>>>>>> read bandwidth (min/max/acc [MB/s]) : 0.370 / 0.370 / 0.370
>>>>>> file size: 1024kB size per process: 1023kB
>>>>>>
>>>>>> # testing noncontiguous in memory, contiguous in file using
>>>>>> independent I/O
>>>>>> # vector count = 26214 - access count = 26214
>>>>>> write bandwidth (min/max/acc [MB/s]) : 0.692 / 0.692 / 0.692
>>>>>> read bandwidth (min/max/acc [MB/s]) : 0.766 / 0.766 / 0.766
>>>>>> file size: 1023kB size per process: 1023kB
>>>>>>
>>>>>> # testing contiguous in memory, noncontiguous in file using
>>>>>> independent I/O
>>>>>> # vector count = 26214 - access count = 26214
>>>>>> write bandwidth (min/max/acc [MB/s]) : 0.348 / 0.348 / 0.348
>>>>>> read bandwidth (min/max/acc [MB/s]) : 0.392 / 0.392 / 0.392
>>>>>> file size: 1024kB size per process: 1023kB
>>>>>>
>>>>>>
>>>>>> MX:
>>>>>> kschoche@bb18:~/framework/noncontig-test/noncontig$ `mpirun -np 1
>>>>>> ./noncontig -fname pvfs2://tmp/pvfs2/blah -fsize 1 &> mx_output`
>>>>>>
>>>>>>
>>>>>> ========= Parameter space dump =========
>>>>>> filename: pvfs2://tmp/pvfs2/blah ionodes
>>>>>> file size (MB): 1 buffer size 0
>>>>>> vector length: 10 element count: 1 vector count: 0
>>>>>> striping factor: 0 striping size: -1 collective buffer size: 0
>>>>>> loops: 1 displacement 0
>>>>>> ========= Dump done =========
>>>>>> #* no verification possible!
>>>>>>
>>>>>> # testing noncontiguous in memory, noncontiguous in file using
>>>>>> independent I/O
>>>>>> # vector count = 26214 - access count = 26214
>>>>>> [E 13:39:06.029976] src/io/description/pint-request.c line 95:
>>>>>> PINT_process_request: no segments or bytes requested!
>>>>>> [E 13:39:06.030497] [bt] ./noncontig [0x4cd655]
>>>>>> [E 13:39:06.030555] [bt] ./noncontig [0x4b2e01]
>>>>>> [E 13:39:06.030608] [bt] ./noncontig [0x4ae8f1]
>>>>>> [E 13:39:06.030658] [bt] ./noncontig [0x507b62]
>>>>>> [E 13:39:06.030707] [bt] ./noncontig [0x5080dd]
>>>>>> [E 13:39:06.030756] [bt] ./noncontig [0x507e2f]
>>>>>> [E 13:39:06.030806] [bt] ./noncontig [0x4a5030]
>>>>>> [E 13:39:06.030854] [bt] ./noncontig [0x4ae202]
>>>>>> [E 13:39:06.030903] [bt] ./noncontig [0x4ae2d5]
>>>>>> [E 13:39:06.030952] [bt] ./noncontig [0x479ab0]
>>>>>> [E 13:39:06.031001] [bt] ./noncontig [0x41df43]
>>>>>> [E 13:39:06.031072] PVFS_isys_io call: Invalid argument
>>>>>> [0] Error -524286 in MPI_File_write
>>>>>> Undefined dynamic error code
>>>>>> [E 13:39:06.067249] Warning: non PVFS2 error code (22):
>>>>>> [E 13:39:06.067468] Send immediately failed: Invalid argument
>>>>>> [E 13:39:06.067525] Send error: cancelling recv.
>>>>>> [E 13:39:06.067599] Warning: non PVFS2 error code (22):
>>>>>> [E 13:39:06.067651] msgpair failed, will retry: Invalid argument
>>>>>> [E 13:39:06.067706] *** msgpairarray_completion_fn: msgpair to
>>>>>> server mx://bb15:0:3 failed: Invalid argument
>>>>>> [E 13:39:06.067755] *** Non-BMI failure.
>>>>>> [E 13:39:06.074742] Warning: non PVFS2 error code (22):
>>>>>> [E 13:39:06.074795] Send immediately failed: Invalid argument
>>>>>> [E 13:39:06.074843] Send error: cancelling recv.
>>>>>> [E 13:39:06.074900] Warning: non PVFS2 error code (22):
>>>>>> [E 13:39:06.074948] msgpair failed, will retry: Invalid argument
>>>>>> [E 13:39:06.074998] *** msgpairarray_completion_fn: msgpair to
>>>>>> server mx://bb15:0:3 failed: Invalid argument
>>>>>> [E 13:39:06.075046] *** Non-BMI failure.
>>>>>> [E 13:39:06.075396] Warning: non PVFS2 error code (22):
>>>>>> [E 13:39:06.075447] Send immediately failed: Invalid argument
>>>>>> [E 13:39:06.075493] Send error: cancelling recv.
>>>>>> [E 13:39:06.075551] Warning: non PVFS2 error code (22):
>>>>>> [E 13:39:06.075599] msgpair failed, will retry: Invalid argument
>>>>>> [E 13:39:06.075649] *** msgpairarray_completion_fn: msgpair to
>>>>>> server mx://bb15:
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> <mx.output.bz2>
>>>
>>> _______________________________________________
>>> Pvfs2-developers mailing list
>>> Pvfs2-developers at beowulf-underground.org
>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>>
>>
>> <nc.out2.bz2>
>
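
Note on the quoted MX log: the "non PVFS2 error code (22)" lines are
consistent with Scott's remark that error 22 is EINVAL on Linux, and the
accompanying "Invalid argument" text is the standard message for that
errno. A minimal sketch for checking the mapping follows. It is written in
Python rather than the C of the PVFS2 tree, and the helper name
describe_errno is hypothetical and not part of PVFS2 or bmi_mx:

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, OS message) for a numeric errno value.

    Hypothetical helper for illustration only; not part of PVFS2.
    """
    # errno.errorcode maps numeric values to names such as "EINVAL".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message for the error number.
    return name, os.strerror(code)


if __name__ == "__main__":
    name, message = describe_errno(22)
    print(f"errno 22: {name} ({message})")
```

This only confirms what the number means; it says nothing about which
argument bmi_mx rejected, which is what the BMX_DB_ALL debug output is
meant to show.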
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nc.out2.bz2
Type: application/x-bzip
Size: 45825 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20070802/36b692fc/nc.out2-0001.bin