[PVFS-users] random ll_pvfs_file_write ...downcall

Kent F. Milfeld milfeld at tacc.utexas.edu
Fri Feb 13 00:45:51 EST 2004


Hi,

We recently installed PVFS 1.6.2 (after successfully running 1.5.x).
When I run 16-processor MPI-IO jobs, the I/O sometimes fails with the
following error information reported by the code (/mnt/pvfs usually
becomes unmounted as well):

 

...
 rank=           9  CLOSE IOERR=           0
 rank=          10  WRITE IOERR=           0    host=compute-9-23

 rank=          10  CLOSE IOERR=           0
 rank=           8  WRITE IOERR=           0    host=compute-9-30

 rank=           8  CLOSE IOERR=           0
 rank=           6  WRITE IOERR=        8288    host=compute-1-7
...
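
With 16 ranks the one failing line is easy to miss among the zeros. A minimal sketch for picking out the non-zero IOERR lines, assuming the output keeps exactly the `rank= ... OP IOERR= ... host=...` layout shown above; the name `failed_operations` and the regular expression are illustrative assumptions, not part of PVFS or the MPI-IO test code:

```python
import re

# Matches lines such as:
#  rank=           6  WRITE IOERR=        8288    host=compute-1-7
# The host= field is optional because the CLOSE lines do not print it.
LINE_RE = re.compile(
    r"rank=\s*(?P<rank>\d+)\s+(?P<op>\w+)\s+IOERR=\s*(?P<ioerr>-?\d+)"
    r"(?:\s+host=(?P<host>\S+))?"
)


def failed_operations(lines):
    """Return (rank, op, ioerr, host) for every line with a non-zero IOERR."""
    failures = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and int(m.group("ioerr")) != 0:
            failures.append(
                (int(m.group("rank")), m.group("op"),
                 int(m.group("ioerr")), m.group("host"))
            )
    return failures


# Sample lines copied from the job output above.
sample = [
    " rank=           9  CLOSE IOERR=           0",
    " rank=          10  WRITE IOERR=           0    host=compute-9-23",
    " rank=           8  CLOSE IOERR=           0",
    " rank=           6  WRITE IOERR=        8288    host=compute-1-7",
]

for failure in failed_operations(sample):
    print(failure)  # (6, 'WRITE', 8288, 'compute-1-7')
```

The host name it reports tells you which node's kernel log to look at.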

 

In /var/log/kern on compute-1-7, I found the following information:

 

 

 

********************************************************************************

 

Two runs, one at about 00:09:30 and the other at about 00:17.

compute-1-11
Feb 13 00:09:30 compute-1-11 kernel: (ll_pvfs.c, 665): ll_pvfs_file_write got error in downcall
Feb 13 00:14:42 compute-1-11 kernel: (ll_pvfs.c, 459): ll_pvfs_getmeta failed on enqueue for 146.6.250.1:3000/pvfs-meta
compute-2-28
compute-1-0
compute-4-31
compute-2-4
compute-1-9
compute-1-12
Feb 13 00:18:56 compute-1-12 kernel: (pvfsdev.c, 1118): pvfsdev: setup_buffer() failure.
Feb 13 00:18:56 compute-1-12 kernel: (ll_pvfs.c, 659): ll_pvfs_file_write failed on 2600340
compute-2-31

 

Some results from two days earlier:

Feb 10 15:27:25 compute-1-7 kernel: pvfs: debug = 0x0, maxsz = 16777216 bytes, buffer = dynamic, major = 0
Feb 11 16:04:14 compute-1-7 kernel: (ll_pvfs.c, 233): ll_pvfs_create failed on enqueue for 146.6.250.1:3000/pvfs-meta/test18
Feb 11 16:04:14 compute-1-7 kernel: (ll_pvfs.c, 87): ll_pvfs_lookup failed on enqueue for 146.6.250.1:3000/pvfs-meta/test18
Feb 12 17:56:32 compute-1-7 kernel: (ll_pvfs.c, 665): ll_pvfs_file_write got error in downcall
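
To see which kernel-module code paths fail on which nodes, entries like these can be grouped by their `(file, line)` tag. A minimal sketch, assuming syslog-style lines in the format shown above and that any line not starting with a timestamp is the wrapped tail of the previous entry (so bare hostname lines should be dropped before use); `unwrap` and `group_by_site` are hypothetical helper names:

```python
import re
from collections import defaultdict

STAMP = r"\w{3}\s+\d+\s+\d\d:\d\d:\d\d"

# "<stamp> <host> kernel: (<file>, <line>): <message>"
KERN_RE = re.compile(
    r"^(?P<stamp>" + STAMP + r")\s+(?P<host>\S+)\s+kernel:\s+"
    r"\((?P<src>[\w.]+),\s*(?P<line>\d+)\):\s*(?P<msg>.*)$"
)


def unwrap(lines):
    """Rejoin log entries that were wrapped onto a second line."""
    entries = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        if re.match(STAMP + r"\s", text) or not entries:
            entries.append(text)          # a new entry starts with a timestamp
        else:
            entries[-1] += " " + text     # continuation of the previous entry
    return entries


def group_by_site(entries):
    """Map (source file, line number) -> list of (host, timestamp, message)."""
    sites = defaultdict(list)
    for entry in entries:
        m = KERN_RE.match(entry)
        if m:  # entries without a "(file, line):" tag are skipped
            key = (m.group("src"), int(m.group("line")))
            sites[key].append((m.group("host"), m.group("stamp"), m.group("msg")))
    return dict(sites)


# Sample entries copied from the logs in this message, wrapped as mailed.
sample_log = [
    "Feb 11 16:04:14 compute-1-7 kernel: (ll_pvfs.c, 233): ll_pvfs_create",
    "failed on enqueue for 146.6.250.1:3000/pvfs-meta/test18",
    "Feb 12 17:56:32 compute-1-7 kernel: (ll_pvfs.c, 665): ll_pvfs_file_write",
    "got error in downcall",
    "Feb 13 00:09:30 compute-1-11 kernel: (ll_pvfs.c, 665):",
    "ll_pvfs_file_write got error in downcall",
]

for site, hits in sorted(group_by_site(unwrap(sample_log)).items()):
    print(site, [host for host, _, _ in hits])
```

In the sample, the `(ll_pvfs.c, 665)` write failure shows up on both compute-1-7 and compute-1-11, so it is not tied to a single node.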

 

 

 

*********************************************************

 

[root@compute-1-30 root]# rpm -qa | grep pvfs
pvfs-1.6.2-1
contrib-pvfs-config-1.6.2-1
pvfs-kernel-1.6.2-1

 

 

 

Any idea of what might be happening?

 

Thanks,

Kent Milfeld

TACC, Texas Advanced Computing Center

 

Kent Milfeld  Ph.D.  Research Associate
Texas Advanced Computing Center
The University of Texas at Austin
http://www.tacc.utexas.edu/  

(512) 475-9411 (main)
(512) 475-9458 (direct)
(512) 475-9445 (fax)
milfeld at tacc.utexas.edu 

 

 

 
