[Pvfs2-developers] Re: noncontig-test

walt walt at CLEMSON.EDU
Mon Aug 6 13:20:28 EDT 2007


What do you mean when you say "fails?"  What you have shown here SHOULD 
produce an error - it should not crash.  The bytemax should not be less 
than bytes, and in any case should not be negative.  It seems that the 
caller has for some reason passed an improperly set up result structure.

I haven't checked the bmi code, but this appears to be a module that is 
trying to decide which servers have part of the data for this request. 
For this we usually set the bytemax to 1 (which says if there is at 
least one byte on this server, stop and let us know).  Maybe we should 
add an error check for a negative bytemax, but at least in this case it 
should have called gossip_error.
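
To make the missing check concrete: the test visible at line 88 of the trace
(!result || !result->segmax || !result->bytemax) only rejects a zero bytemax,
so a value like -1291 gets past it. Below is a minimal sketch of a stricter
sanity check. The struct is a stand-in whose field names are taken from the
gdb dump; the field types, the enum, and the check_result name are all
assumptions for illustration, not the actual PVFS2 definitions, and
fprintf(stderr, ...) stands in for gossip_error.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the result structure printed by gdb.
 * Field names follow the dump; the types are assumed. */
struct request_result {
    int64_t *offset_array;
    int64_t *size_array;
    int32_t  segmax;
    int32_t  segs;
    int64_t  bytemax;
    int64_t  bytes;
};

/* Hypothetical status codes for this sketch only. */
enum result_status {
    RESULT_OK = 0,
    RESULT_MISSING,
    RESULT_BAD_SEGMAX,
    RESULT_BAD_BYTEMAX,
    RESULT_BYTES_EXCEED_MAX
};

/* Reject a result structure whose limits are inconsistent.
 * Unlike a plain "!bytemax" test, this also catches negative values. */
enum result_status check_result(const struct request_result *r)
{
    if (!r) {
        fprintf(stderr, "result structure is NULL\n");
        return RESULT_MISSING;
    }
    if (r->segmax <= 0 || r->segs < 0 || r->segs > r->segmax) {
        fprintf(stderr, "bad segment limits: segmax=%d segs=%d\n",
                (int)r->segmax, (int)r->segs);
        return RESULT_BAD_SEGMAX;
    }
    if (r->bytemax <= 0) {
        fprintf(stderr, "bad bytemax: %lld\n", (long long)r->bytemax);
        return RESULT_BAD_BYTEMAX;
    }
    if (r->bytes < 0 || r->bytes > r->bytemax) {
        fprintf(stderr, "bytes=%lld outside [0, bytemax=%lld]\n",
                (long long)r->bytes, (long long)r->bytemax);
        return RESULT_BYTES_EXCEED_MAX;
    }
    return RESULT_OK;
}
```

With the values from the dump (segmax = 1, segs = 0, bytemax = -1291,
bytes = 0) this returns RESULT_BAD_BYTEMAX, while the same structure with
bytemax = 1, as we normally set it here, passes.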

Walt

Scott Atchley wrote:
> Hi Sam,
> 
> Kyle sent me the code and I compiled it this morning.
> 
> First, I was using mpich2-mx compiled with PVFS2 support. It failed with 
> the error that MX was already initialized. Both mpich2-mx and bmi_mx are 
> calling mx_init(). I changed bmi_mx to ignore MX_ALREADY_INITIALIZED.
> 
> Second, I do not see any errors returned in bmi_mx. It fails in 
> PINT_process_request (see call trace below). The request has segs = 0,  
> bytemax = -1291, and bytes = 0.
> 
> It could well be that these values are incorrect due to a bug in bmi_mx 
> that is not flagging an error, but I have no idea.
> 
> Can you take a look at this?
> 
> Thanks,
> 
> Scott
> 
> 
> 0:  (gdb) b PINT_process_request
> 0:  Breakpoint 2 at 0x4701c8: file src/io/description/pint-request.c, 
> line 72.
> 0:  (gdb) run -fname pvfs2://mnt/pvfs2/atchley/blah -fsize 1 -timing
> 0:  Continuing.
> 0:  ========= Parameter space dump =========
> 0:  filename: pvfs2://mnt/pvfs2/atchley/blah  ionodes
> 0:  file size (MB): 1 buffer size 0
> 0:  vector length: 10 element count: 1 vector count: 0
> 0:  striping factor: 0 striping size: -1 collective buffer size: 0
> 0:  loops: 1 displacement 0
> 0:  ========= Dump done            =========
> 0:  #* no verification possible!
> 0:  calling noncontigmem_noncontigfile(pvfs2://mnt/pvfs2/atchley/blah, 
> 0x0x2aaaaaaab010, 1048560)
> 0:
> 0:  # testing noncontiguous in memory, noncontiguous in file using 
> independent I/O
> 0:  # vector count = 26214 - access count = 26214
> 0:  calling MPI_File_open(pvfs2://mnt/pvfs2/atchley/blah)
> 0:  calling MPI_File_set_view()
> 0:  calling MPI_File_seek()
> 0:  calling MPI_File_write()
> 0:  [New Thread 1082132816 (LWP 29290)]
> 0:  [New Thread 1090525520 (LWP 29291)]
> 0:
> 0:  Breakpoint 2, PINT_process_request (req=0x6aea50, mem=0x6aeb00,
> 0:      rfdata=0x7fffd112b880, result=0x7fffd112b850, mode=2)
> 0:      at src/io/description/pint-request.c:72
> 0:  72          void *temp_space = NULL;    /* temp copy of req state 
> for size call */
> 0:  (gdb) 0:  (gdb) bt
> 0:  #0  PINT_process_request (req=0x6aea50, mem=0x6aeb00, 
> rfdata=0x7fffd112b880,
> 0:      result=0x7fffd112b850, mode=2) at 
> src/io/description/pint-request.c:72
> 0:  #1  0x00000000004844e0 in io_find_target_datafiles (mem_req=0x6ad160,
> 0:      file_req=0x6ae960, file_req_offset=0, dist_p=0x6ae9c0, 
> fs_id=1825963815,
> 0:      io_type=PVFS_IO_WRITE, input_handle_array=0x6b9510, 
> input_handle_count=4,
> 0:      handle_index_array=0x6b9240, handle_index_out_count=0x7fffd112b944,
> 0:      sio_handle_index_array=0x6aea30, 
> sio_handle_index_count=0x7fffd112b940)
> 0:      at src/client/sysint/sys-io.sm:2320
> 0:  #2  0x0000000000480010 in io_datafile_setup_msgpairs (sm_p=0x6ba4a0,
> 0:      js_p=0x7fffd112b9f0) at src/client/sysint/sys-io.sm:489
> 0:  #3  0x0000000000476a66 in PINT_state_machine_next (s=0x6ba4a0,
> 0:      r=0x7fffd112b9f0) at ./src/common/misc/state-machine-fns.h:158
> 0:  #4  0x0000000000476645 in PINT_client_state_machine_post 
> (sm_p=0x6ba4a0,
> 0:      pvfs_sys_op=6, op_id=0x7fffd112bb30, user_ptr=0x0)
> 0:      at src/client/sysint/client-state-machine.c:312
> 0:  #5  0x000000000047f9fc in PVFS_isys_io (ref=
> 0:        {handle = 1048563, fs_id = 1825963815, __pad1 = 0}, 
> file_req=0x6ae960,
> 0:      file_req_offset=0, buffer=0x0, mem_req=0x6ad160, 
> credentials=0x6b8ea0,
> 0:      resp_p=0x7fffd112bba0, io_type=PVFS_IO_WRITE, op_id=0x7fffd112bb30,
> 0:      user_ptr=0x0) at src/client/sysint/sys-io.sm:328
> 0:  #6  0x000000000047facf in PVFS_sys_io (ref=
> 0:        {handle = 1048563, fs_id = 1825963815, __pad1 = 0}, 
> file_req=0x6ae960,
> 0:      file_req_offset=0, buffer=0x0, mem_req=0x6ad160, 
> credentials=0x6b8ea0,
> 0:      resp_p=0x7fffd112bba0, io_type=PVFS_IO_WRITE)
> 0:      at src/client/sysint/sys-io.sm:351
> 0:  #7  0x0000000000458cb2 in ADIOI_PVFS2_WriteStrided (fd=0x6b8d00,
> 0:      buf=0x2aaaaaaab010, count=26214, datatype=-1946157050, 
> file_ptr_type=101,
> 0:      offset=0, status=0x7fffd112be30, error_code=0x7fffd112bd70)
> 0:      at 
> /nfs/home/atchley/projects/mpich2/mpich2-snap-200706132016/src/mpi/romio/adio/ad_pvfs2/ad_pvfs2_write.c:1001 
> 
> 0:  #8  0x000000000041afcb in MPIOI_File_write (mpi_fh=0x6b8d00, offset=0,
> 0:      file_ptr_type=101, buf=0x2aaaaaaab010, count=26214, 
> datatype=-1946157050,
> 0:      myname=0x63ac74 "MPI_FILE_WRITE", status=0x7fffd112be30)
> 0:      at 
> /nfs/home/atchley/projects/mpich2/mpich2-snap-200706132016/src/mpi/romio/mpi-io/write.c:156 
> 
> 0:  #9  0x000000000041aafd in PMPI_File_write (mpi_fh=0x6b8d00,
> 0:      buf=0x2aaaaaaab010, count=26214, datatype=-1946157050,
> 0:      status=0x7fffd112be30)
> 0:      at 
> /nfs/home/atchley/projects/mpich2/mpich2-snap-200706132016/src/mpi/romio/mpi-io/write.c:52 
> 
> 0:  #10 0x000000000040461e in noncontigmem_noncontigfile (
> 0:      filename=0x668110 "pvfs2://mnt/pvfs2/atchley/blah", 
> buf=0x2aaaaaaab010,
> 0:      bufsize=1048560, dtype=-1946157050, offset=0, displs=0, 
> finfo=-1677721600,
> 0:      veclen=10, elmtcount=1, veccount=26214) at noncontig.c:185
> 0:  #11 0x000000000040738d in main (argc=1, argv=0x7fffd112c608)
> 0:      at noncontig.c:1020
> 0:  (gdb) s
> 0:  74          PVFS_offset  contig_offset = 0; /* temp for offset of a 
> contig region */
> 0:  (gdb)
> 0:  78          if (!PINT_IS_MEMREQ(mode))
> 0:  (gdb)
> 0:  79          gossip_debug(GOSSIP_REQUEST_DEBUG,
> 0:  (gdb)
> 0:  81          
> gossip_debug(GOSSIP_REQUEST_DEBUG,"PINT_process_request\n");
> 0:  (gdb)
> 0:  83          if (!req)
> 0:  (gdb)
> 0:  88          if (!result || !result->segmax || !result->bytemax)
> 0:  (gdb) p *result
> 0:  $1 = {offset_array = 0x7fffd112b8a8, size_array = 0x7fffd112b8a0, 
> segmax = 1,
> 0:    segs = 0, bytemax = -1291, bytes = 0}