[PVFS-users] setup_buffer() failure 8288 error on write pvfsdev: cannot allocate memory
milfeld@tacc.utexas.edu
milfeld@tacc.utexas.edu
Mon, 26 Jan 2004 23:22:13 -0600 (CST)
Hi,
I'm having a problem with our newly installed PVFS, and if I cannot resolve it.
We have installed the pvfs-kernel interface 1.5.6 on our 600 processor system (lonestar):
1.) The kernel version is (uname -a):
Linux <hostname> 2.4.18-27.7.xsmp #1 SMP Thu Jan 8 13:16:14 CST 2004 i686 unknown
It contains perfcrt patches for PAPI.
2.) CC is the gcc compiler.
3.) modules.config contains:
alias char-major-60 pvfs
options pvfs buffer=mapped
4.) "mknod /dev/pvfsd c 60 0" is used to create the pvfsd device file.
5.) There are 15 dual-xeon systems running dual iod's, and the master
node is on the login node.
6.) A GigE switch hierarchy uses Dell PowerConnect 5224's aggregated through 3248's
to form the network for PVFS between the compute nodes and the I/O server nodes.
Problem:
On a simple MPI-IO test, using 16-64 clients I get an I/O error 8288 (sometime many) back from
the MPI-IO write. (The code runs fine on our IBM GPFS system; and I've included
it at the end.) On the nodes where the IO error occurs we get setup_buffer and memory errors
reported below in the /var/log/kern log.
1.) Below, I have reproduced the error three times on the same node. (I performed
a "./pvfsd restart" in /etc/init.d after each failure, and resubmitted the job
using a different file for writing.) Sometimes the mount point hangs after
this error.
2.) I noticed the 8288 in previous interactions, but that number appears not to
point to a definitive problem.
3.) I have a feeling that the problem is caused by the "setup_buffer" failure,
and that it might be caused when using the "options pvfs buffer=mapped"
option above. Also, I found a previous history for this problem at:
http://www.beowulf-underground.org/pipermail/pvfs-developers/2003-January/001107.html
and in a correspondence between Murali Vilayannur and Nathan Poznick
>> Thus spake Murali Vilayannur:
>>> Oh. The process seems to have been stuck in setup_buffer(), when it
>>> received the signal? What was the command line options to insmod pvfs.o?
>>> Could you also try with insmod pvfs.o buffer=3Dstatic and see if
>>> the same thing happens?=20
>>We were using buffer=3Ddynamic (can't use mapped since we use
>>CONFIG_HIGHMEM4G, and actually have 4G in the clients). I switched it
>>to buffer=3Dstatic, re-ran the problem job, and bingo! It finishes without
>>any errors. We're running the job again now to make sure (but this is
>>the first time it's actually finished), and after that, I'm going to
>>switch back to kpvfsd with buffer=3Dstatic and run it once more.
The Question:
Is there a fix for this problem?
If so, can it be used for this version (possibly by not using a "mapped" buffer?).
It looks as though we should try a different buffer setting. What value should be used
(3Dstatic, static, etc.)?
Do we need to upgrade to fix the problem?
Thank you for your help; and more importantly, I thank all who have made PVFS a success!!!!!
Kent Milfeld
Texas Advanced Computing Center
-------------------------------- /var/log/kern --------------------------------------------
Jan 22 11:25:40 compute-3-22 kernel: (pvfsdev.c, 1356): pvfsdev: could not allocate memory!
Jan 22 11:25:40 compute-3-22 kernel: (pvfsdev.c, 1032): pvfsdev: setup_buffer() failure.
Jan 22 11:25:40 compute-3-22 kernel: (pvfsdev.c, 1356): pvfsdev: could not allocate memory!
Jan 22 11:25:40 compute-3-22 kernel: (pvfsdev.c, 1032): pvfsdev: setup_buffer() failure.
Jan 22 11:25:40 compute-3-22 kernel: (ll_pvfs.c, 553): ll_pvfs_file_write failed on 2600340
Jan 22 11:25:40 compute-3-22 kernel: (ll_pvfs.c, 553): ll_pvfs_file_write failed on 2600340
Jan 26 22:12:21 compute-3-22 kernel: (pvfsdev.c, 1356): pvfsdev: could not allocate memory!
Jan 26 22:12:21 compute-3-22 kernel: (pvfsdev.c, 1032): pvfsdev: setup_buffer() failure.
Jan 26 22:12:21 compute-3-22 kernel: (ll_pvfs.c, 553): ll_pvfs_file_write failed on 2600620
Jan 26 22:17:09 compute-3-22 kernel: (pvfsdev.c, 1032): pvfsdev: setup_buffer() failure.
Jan 26 22:17:09 compute-3-22 kernel: (ll_pvfs.c, 553): ll_pvfs_file_write failed on 2600620
Jan 26 22:17:57 compute-3-22 kernel: (pvfsdev.c, 1356): pvfsdev: could not allocate memory!
Jan 26 22:17:57 compute-3-22 kernel: (pvfsdev.c, 1032): pvfsdev: setup_buffer() failure.
Jan 26 22:17:57 compute-3-22 kernel: (ll_pvfs.c, 553): ll_pvfs_file_write failed on 2600621
Jan 26 22:21:18 compute-3-22 kernel: (pvfsdev.c, 1032): pvfsdev: setup_buffer() failure.
Jan 26 22:21:18 compute-3-22 kernel: (ll_pvfs.c, 553): ll_pvfs_file_write failed on 2600621
Jan 26 22:22:21 compute-3-22 kernel: (pvfsdev.c, 1356): pvfsdev: could not allocate memory!
Jan 26 22:22:21 compute-3-22 kernel: (pvfsdev.c, 1032): pvfsdev: setup_buffer() failure.
Jan 26 22:22:21 compute-3-22 kernel: (ll_pvfs.c, 553): ll_pvfs_file_write failed on 2600622
[root@compute-3-22 log]#
-------------------------------- ipcs --------------------------------------------
[root@compute-3-22 log]# ipcs
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 0 root 600 1056768 9 dest
0x00000000 32769 root 600 33554432 9 dest
0x00000000 163842 root 600 33554432 9 dest
0x00000000 98307 root 600 46084 9 dest
------ Semaphore Arrays --------
key semid owner perms nsems status
0x00000000 65536 apache 600 1
------ Message Queues --------
key msqid owner perms used-bytes messages
-------------------------------- top on compute-3-22 --------------------------------------------
10:28pm up 13 days, 7:19, 1 user, load average: 1.06, 0.92, 0.57
70 processes: 66 sleeping, 2 running, 2 zombie, 0 stopped
CPU0 states: 0.0% user, 0.1% system, 0.0% nice, 99.4% idle
CPU1 states: 100.0% user, 0.0% system, 0.0% nice, 0.0% idle
Mem: 2064716K av, 678452K used, 1386264K free, 0K shrd, 95004K buff
Swap: 2096440K av, 4972K used, 2091468K free 371792K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
10490 milfeld 25 0 39836 38M 18372 R 99.9 1.9 5:53 a.out9
-------------------------------- a.out9 (dual processor) compute-3-22 --------------------------------
[root@compute-3-22 log]# ps -elf | grep a.out9
000 R milfeld 10490 10489 99 85 0 - 12921 - 22:22 ? 00:07:49 ./a.out9
000 S milfeld 10499 10498 0 77 0 - 12919 end 22:22 ? 00:00:00 ./a.out9
-------------------------------- code --------------------------------
module comm_mpi
include 'mpif.h'
integer :: ntasks, irank, ierr_mpi
integer :: namelen
character(32) :: name
end module
module comm_mpi_io
integer, parameter :: NTIMES=1
integer, parameter :: NBYTES=1024*1024*16
real*8, dimension(NBYTES/8) :: buf
real*8 :: bcalc
end module
module comm
integer, parameter :: KI8=selected_int_kind(12)
end module
program block_sequence
use comm
use comm_mpi
use comm_mpi_io
implicit none
! character(27) :: filename="/rmount/gpfs15/milfeld/test"
! 123456789012345678901234567901234567890
! character(23) :: filename="/mnt/pvfs/milfeld/test3"
character(15) :: filename="/mnt/pvfs/test9"
integer :: nargs, iargc, length_fn
integer :: i, knt, j
integer :: fh, status(MPI_STATUS_SIZE)
integer(kind=KI8) :: offset
integer :: istart,iend,itar0,itar1,irate
real*8 :: tdiff,fmbps
!MPI Setup
call mpi_init(ierr_mpi)
call mpi_comm_size(MPI_COMM_WORLD, ntasks, ierr_mpi)
call mpi_comm_rank(MPI_COMM_WORLD, irank, ierr_mpi)
call MPI_Get_processor_name(name, namelen, ierr_mpi);
print*,"Checking in from task",irank
offset = irank*NBYTES
call system_clock(count_rate=irate)
call system_clock(count=itar0)
call system_clock(count=itar1)
call system_clock(count=istart)
do j = 1,NBYTES/8
buf(j) = j + irank*(NBYTES/8)
end do
call system("hostname; ls -l /mnt/pvfs")
do i=1, NTIMES
print*, "RANK ",irank," i=",i," Opening ", filename
! MPI_MODE_CREATE+MPI_MODE_RDWR = 8 + 4
call MPI_FILE_OPEN(MPI_COMM_WORLD, filename, &
& MPI_MODE_CREATE+MPI_MODE_RDWR, MPI_INFO_NULL, &
& fh, ierr_mpi)
print*,"OPEN IOERR=",ierr_mpi
call MPI_FILE_SEEK( fh, offset, MPI_SEEK_SET, ierr_mpi)
print*,"SEEK IOERR=",ierr_mpi
call MPI_FILE_WRITE(fh, buf, NBYTES, MPI_BYTE, status, ierr_mpi)
print*,"WRITE IOERR=",ierr_mpi," host=",name
call MPI_FILE_CLOSE(fh, ierr_mpi)
print*,"CLOSE IOERR=",ierr_mpi
call MPI_BARRIER(MPI_COMM_WORLD, ierr_mpi)
do j = 1,NBYTES/8
buf(j) = 0.0D0
end do
! call MPI_FILE_OPEN(MPI_COMM_WORLD, filename, &
! & MPI_MODE_RDWR, MPI_INFO_NULL, &
! & fh, ierr_mpi)
!
! call MPI_FILE_SEEK(fh, offset, MPI_SEEK_SET, ierr_mpi)
! call MPI_FILE_READ(fh, buf, NBYTES, MPI_BYTE, status, ierr_mpi)
! do j = 1,NBYTES/8
! bcalc = j + irank*(NBYTES/8)
!
! if( buf(j) .ne. bcalc ) then
! print*, "task=",irank," diff rear/calc =",buf(j),bcalc
! stop
! end if
! end do
!
! call MPI_FILE_CLOSE(fh, ierr_mpi)
call MPI_BARRIER(MPI_COMM_WORLD, ierr_mpi)
end do
call system_clock(count=iend)
if(irank .eq. 0) then
tdiff = real(iend-istart - (itar1-itar0))/real(irate)
fmbps = NTIMES*(NBYTES/(1024*1024))*ntasks/tdiff
print*,' time = ',tdiff, ' MB/sec =',fmbps,' for ',ntasks,' tasks.'
endif
call MPI_Finalize(ierr_mpi)
end program
! not used here********************VVVV
subroutine get_cmdln_filename(length_fn,max_length,filename)
implicit none
character(*) :: filename
integer :: length_fn,max_length
external length_string
integer :: length_string
integer :: i, nargs, iargc
do i = 1,max_length
filename(i:i) = " "
end do
nargs = iargc()
if(nargs .lt. 1 .or. nargs .gt. 2) then
print*,"ERROR: SYNTAX: <prog> [-filename <file> | <filename ]"
stop
endif
do i = 1,nargs
call getarg(i,filename)
enddo
length_fn = length_string(max_length, 1,filename)
print*,"length=",length_fn," filename.",filename(1:length_fn),"."
end subroutine
integer function length_string(max_len,idirection,string)
!
!---------------------------------------------------------------------------
! Return string length.
!
! INPUT :
! max_len Defined length of character string
! ibackward
! -1 = String ends at first non-blank in backward search
! 1 = String ends at first non-blank in forward search.
! string String to be search.
!---------------------------------------------------------------------------
implicit none
character(*) :: string
integer :: max_len,idirection
integer :: i, istart, iend
if(idirection .ne. 1 .and. idirection .ne. -1) then
print*, "ERROR: in function length_string: idirection .ne. 1 or -1"
stop "ERROR: length_string"
endif
if(idirection .eq. 1) then
istart = 1
iend = max_len
length_string = -1 !max_len
else
istart = max_len
iend = 1
length_string = -1 !1
endif
! Search for first " " character on forward search
! Search for first non-" " character on backward search
do i = istart, iend, idirection
if( idirection .eq. 1) then
if(string(i:i).eq.' ')then
length_string = i-1
return
end if
end if
if( idirection .eq.-1) then
if(string(i:i).ne.' ')then
length_string = i
return
end if
end if
end do
return
end function