[PVFS-users] Hangup is back

Brannen S Hough bshough at impactsci.com
Thu Aug 19 17:15:00 EDT 2004


 

             Hi Rob,

             Just when I thought I was out of the woods - I thought I'd
found the networking issue behind the problems (though not the direct
cause): having 2 gigabit Ethernet cards in one machine and trying to be on
the local testing network (for the PVFS nodes) and the company network
(for general use) at the same time.  I spent a day trying to reproduce the
hangup to get more information, but could not.

             The problem popped up again - while 2 copies of my test program
were running (one on each node of a 2 node PVFS cluster).  It took 35 solid
minutes of reading and writing files at full speed for it to show up, and
only one of the programs hung.  The other completed normally.  They were
both reading and writing files using the PVFS library calls (pvfs_open,
etc).

 

             I've got a run of 'netstat -tan' that might be useful, caught
after the good test completed and while the bad test was still hung.  I
didn't wait around to see if it would spontaneously restart.

 

Recorded 3:15 PM, Aug 19, 2004 - running 2 test programs concurrently on 2
node PVFS.

Tests running in direct mode (PVFS library calls, not standard file I/O).

 

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:32768           0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:32769         0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:32770           0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:938             0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:21              0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:7000            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:3000            0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.3:7000           10.0.0.3:32786          ESTABLISHED
tcp        0      0 10.0.0.3:32771          10.0.0.6:22             ESTABLISHED
tcp        0      0 10.0.0.3:32772          10.0.0.6:22             ESTABLISHED
tcp        0      0 10.0.0.3:32774          10.0.0.3:7000           ESTABLISHED
tcp        0      0 10.0.0.3:32786          10.0.0.3:7000           ESTABLISHED
tcp        0      0 10.0.0.3:3000           10.0.0.3:32784          ESTABLISHED
tcp        0      0 10.0.0.3:32773          10.0.0.6:7000           ESTABLISHED
tcp        0      0 10.0.0.3:7000           10.0.0.3:32774          ESTABLISHED
tcp        0      0 10.0.0.3:32785          10.0.0.6:7000           ESTABLISHED
tcp        0      0 10.0.0.3:32784          10.0.0.3:3000           ESTABLISHED

 

             What do you think?  It seems odd that there are seven
references to port 7000 (the 2 iods), but there isn't any traffic built up
and pending.  Strange.
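
             For anyone who wants to repeat that check on a later capture,
here is a minimal Python sketch. It assumes each connection sits on a single
line of the 'netstat -tan' output (State not wrapped onto the next line). The
names parse_netstat, summarize and SAMPLE are made up for this example, and
SAMPLE holds only the port 7000 rows from the listing in this message plus one
unrelated row. It counts the rows that touch a given port and reports any with
a non-zero Recv-Q or Send-Q.

```python
# Sketch only: helper names are hypothetical, not part of PVFS or netstat.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:7000            0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.3:7000           10.0.0.3:32786          ESTABLISHED
tcp        0      0 10.0.0.3:32774          10.0.0.3:7000           ESTABLISHED
tcp        0      0 10.0.0.3:32786          10.0.0.3:7000           ESTABLISHED
tcp        0      0 10.0.0.3:32773          10.0.0.6:7000           ESTABLISHED
tcp        0      0 10.0.0.3:7000           10.0.0.3:32774          ESTABLISHED
tcp        0      0 10.0.0.3:32785          10.0.0.6:7000           ESTABLISHED
"""


def parse_netstat(text):
    """Turn the tcp rows of `netstat -tan` output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything that is not a tcp row.
        if len(fields) < 5 or fields[0] != "tcp":
            continue
        rows.append({
            "recv_q": int(fields[1]),
            "send_q": int(fields[2]),
            "local": fields[3],
            "foreign": fields[4],
            "state": fields[5] if len(fields) > 5 else "",
        })
    return rows


def port_of(addr):
    """Return the port part of an address such as '10.0.0.3:7000'."""
    return addr.rsplit(":", 1)[1]


def summarize(rows, port="7000"):
    """Count rows touching `port` and collect those with queued bytes."""
    touching = [r for r in rows
                if port in (port_of(r["local"]), port_of(r["foreign"]))]
    backlog = [r for r in touching if r["recv_q"] or r["send_q"]]
    return len(touching), backlog


count, backlog = summarize(parse_netstat(SAMPLE))
print("rows touching port 7000:", count)
print("rows with queued data:", len(backlog))
```

Running the same check against captures taken a few seconds apart while a
test is hung would show whether the queues stay at zero or fill up over time.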

 

______________________________________
Brannen Hough
Impact Science & Technology
27 Proctor Hill Road, P.O. Box 1197
Hollis, NH. 03049
bshough at impactsci.com
(603)465-9154 x141
F:(603)465-9792 

 

Note: IST is moving to a new location by the end of August.

The new address will be 85 Northwest Blvd, Nashua, NH. 03063

The new main phone number will be (603) 459-2200.

 
