[PVFS-users] pvfs 1.6.3 causing possible kernel crash
Rob Ross
rross at mcs.anl.gov
Wed Dec 22 15:32:52 EST 2004
Hi Shawn,
So to be clear, you're saying that without the file system mounted your
node locked up?
If that is the case, I think that you're right on target with
stress-testing the nodes. I'd suggest something like memtester
(http://www.qcc.ca/~charlesc/software/memtester/memtester-4.0.4.tar.gz) to
thoroughly beat on the memory subsystem. I'd also look for something to
really trounce the network between that machine and another. Let me know
if you know of or find a good tool for that, so that I can suggest it in
these cases.
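
For the network side, here is a minimal sketch of a point-to-point stress check, assuming Python 3 is available on both machines. One side runs a sink that hashes everything it receives; the other streams random blocks and compares digests at the end. The names (sink_server, stress), the block sizes, and the use of SHA-256 are illustrative assumptions, not an existing tool.

```python
import hashlib
import os
import socket
import threading


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def sink_server(listener):
    """Accept one connection, hash all incoming data, reply with the digest."""
    conn, _ = listener.accept()
    with conn:
        digest = hashlib.sha256()
        while True:
            data = conn.recv(65536)
            if not data:  # peer finished sending
                break
            digest.update(data)
        conn.sendall(digest.digest())


def stress(host, port, rounds=64, block_size=1 << 20):
    """Stream random blocks to the sink; return True if the digests match."""
    local = hashlib.sha256()
    with socket.create_connection((host, port)) as sock:
        for _ in range(rounds):
            block = os.urandom(block_size)
            local.update(block)
            sock.sendall(block)
        sock.shutdown(socket.SHUT_WR)  # signal end of data to the sink
        remote = recv_exact(sock, local.digest_size)
    return remote == local.digest()


if __name__ == "__main__":
    # Loopback demonstration only; a real test would put the sink on the
    # suspect node and run stress() from another machine.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server = threading.Thread(target=sink_server, args=(listener,), daemon=True)
    server.start()
    ok = stress("127.0.0.1", listener.getsockname()[1], rounds=8)
    server.join(timeout=10)
    listener.close()
    print("digests match:", ok)
```

A digest mismatch would point at data corruption somewhere on the path, while a hang or lockup during a long run would point at the node itself. This only exercises TCP over whichever interface the address resolves to, so it says nothing about the Myrinet/gm path.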
Thanks,
Rob
On Wed, 22 Dec 2004, Shawn Needham wrote:
> Rob, et al.,
>
> The 3rd run I tried ended up locking ellipse12 again. This is the node
> that has been persistently crashing and, along with ellipse10, is the only
> node to have crashed in the past 2 weeks.
>
> I have been running hardware tests on these 2 nodes this morning and the
> myrinet-based tests have turned up nothing so far. I'm going to get some
> hardware tests going that should stress other features on these nodes.
>
> I started another set of runs on all of the nodes except 10 and 12 with
> Myrinet and a newly created /mnt/pvfs-2. I'm going to continue running
> these jobs with their I/O done through the interface rather than having
> the pvfs client mounted.
>
> I'll update you on these runs later today.
>
> -Shawn
>
> On Tue, 21 Dec 2004, Shawn Needham wrote:
>
> > So far this is working well and the first run has almost completed. I'm
> > going to try 3 more runs without the client running and raising nend on
> > the problem. If these complete successfully I would have more confidence
> > pinpointing the pvfs module as the problem.
> >
> > I did have 2 runs finish with the old configuration, but the 2 runs
> > after that ended up taking 4 runs each to complete. I should have all of
> > these done by tonight if there are no crashes.
> >
> > -Shawn
> > On Tue, 21 Dec 2004, Rob Ross wrote:
> >
> > > Excellent. Please let me know if things continue to run effectively, then
> > > we can think about how to address the remaining issues.
> > >
> > > Rob
> > >
> > > On Tue, 21 Dec 2004, Shawn Needham wrote:
> > >
> > > > No, I hadn't done that. I guess I glossed over that that's necessary
> > > > when you are running without the filesystem mounted.
> > > >
> > > > It's now running and using the interface correctly.
> > > >
> > > > Thanks,
> > > > Shawn
> > > > On Tue, 21 Dec 2004, Rob Ross wrote:
> > > >
> > > > > Did you prefix the filename with "pvfs:"?
> > > > >
> > > > > Rob
> > > > >
> > > > > On Tue, 21 Dec 2004, Shawn Needham wrote:
> > > > >
> > > > > > Hi Rob,
> > > > > >
> > > > > > I have it set up so all of our compute nodes are I/O servers. 16MB or
> > > > > > more on the clients should not be a problem.
> > > > > >
> > > > > > Also, I wrote a bit too soon as I had been doing runs without any I/O when
> > > > > > writing that last email. When trying to run with I/O through the pvfs
> > > > > > application interface I'm running into an error when it attempts to open
> > > > > > the first file for writing.
> > > > > >
> > > > > > Attached you will find a run2.log file that captures the output from this
> > > > > > run.
> > > > > >
> > > > > > I built pvfs support into Myrinet's mpich implementation, but since
> > > > > > this is not working, I'm not sure if the native pvfs interface is
> > > > > > configured and available for Myrinet's MPI-IO.
> > > > > >
> > > > > > If you would like to see the configuration for the Myrinet MPICH installation
> > > > > > let me know and I can provide those details.
> > > > > >
> > > > > > Thanks,
> > > > > > Shawn
> > > > > > On Tue, 21 Dec 2004, Rob Ross wrote:
> > > > > >
> > > > > > > Hi Shawn,
> > > > > > >
> > > > > > > The size of the static buffer will define an upper bound on the size of an
> > > > > > > I/O operation. So it sort of depends on your number of I/O servers and
> > > > > > > how much free space you have :). If you can spare 16MB of space on your
> > > > > > > clients, try setting it to that.
> > > > > > >
> > > > > > > Rob
> > > > > > >
> > > > > > > On Tue, 21 Dec 2004, Shawn Needham wrote:
> > > > > > >
> > > > > > > > I've started the runs without the modules loaded. I'll let you know how
> > > > > > > > they go.
> > > > > > > >
> > > > > > > > Also, do you have a suggestion for a size for the static buffer for the
> > > > > > > > pvfs module? I'll start perusing the mail list for this information as
> > > > > > > > well.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Shawn
> > > > > > > >
> > > > > > > > On Tue, 21 Dec 2004, Rob Ross wrote:
> > > > > > > >
> > > > > > > > > Thanks Shawn. I think that maxsz is ok. I agree that your users will
> > > > > > > > > want to have the kernel mounted. I was hoping you could run some
> > > > > > > > > experiments to see if that in fact eliminated the problem, so that we
> > > > > > > > > would know for a fact that we weren't barking up the wrong tree.
> > > > > > > > >
> > > > > > > > > Can you do that?
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > Rob
> > > > > > > > >
> > > > > > > > > On Tue, 21 Dec 2004, Shawn Needham wrote:
> > > > > > > > >
> > > > > > > > > > Hi Rob,
> > > > > > > > > >
> > > > > > > > > > On Tue, 21 Dec 2004, Rob Ross wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Shawn,
> > > > > > > > > > >
> > > > > > > > > > > Can you send us the configure output from the pvfs-kernel build? I'd like
> > > > > > > > > > > to know what options it picked.
> > > > > > > > > >
> > > > > > > > > > I've attached the log from the pvfs-1.6.3-kernel-2.4 build. If you need
> > > > > > > > > > more than this let me know.
> > > > > > > > > > >
> > > > > > > > > > > Also, when you load the module, are you specifying a specific "buffer="
> > > > > > > > > > > argument? If you could grab the line out of dmesg that is printed at
> > > > > > > > > > > module load time, that would be helpful.
> > > > > > > > > >
> > > > > > > > > > Rob, I didn't set any special buffer argument. I'm just relying on the
> > > > > > > > > > default setting for that argument which is to handle it dynamically. I'm
> > > > > > > > > > guessing that setting this statically could greatly change its
> > > > > > > > > > stability/performance. Could you let me know a good size to set this
> > > > > > > > > > buffer to for our configuration?
> > > > > > > > > >
> > > > > > > > > > Dec 21 11:38:33 ellipse2 kernel: pvfs: debug = 0x0, maxsz = 16777216
> > > > > > > > > > bytes, buffer = dynamic, major = 0
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > If this is a PVFS problem, it is most likely in the kernel code; otherwise
> > > > > > > > > > > you shouldn't get hangs.
> > > > > > > > > > >
> > > > > > > > > > > I don't understand why you're getting much traffic at all across the
> > > > > > > > > > > kernel interface with the FLASH code; I think that it is likely that
> > > > > > > > > > > something is misconfigured there in the I/O stack. Can you try prepending
> > > > > > > > > > > a "pvfs:" to your file names and see if that helps? Alternatively, you
> > > > > > > > > > > could create an /etc/pvfstab describing the mount point and not mount the
> > > > > > > > > > > file system at all, eliminating the pvfs-kernel piece from the equation
> > > > > > > > > > > entirely.
> > > > > > > > > >
> > > > > > > > > > I have created the /etc/pvfstab entry for the mount point. So applications
> > > > > > > > > > should be able to interface with the filesystem without being mounted.
> > > > > > > > > > However, without having the kernel loaded and the client running, is there
> > > > > > > > > > a way to access those directories with standard Linux commands (ls, cp,
> > > > > > > > > > etc.)? For testing, not mounting these directories would be fine, but I'm
> > > > > > > > > > not sure it would offer an ideal situation down the road where users
> > > > > > > > > > would want access to their data while running. I guess the latter point is
> > > > > > > > > > just something to keep in mind for now.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Shawn
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Rob
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 20 Dec 2004, Shawn Needham wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Greetings,
> > > > > > > > > > > >
> > > > > > > > > > > > In the past week I have been attempting to run our FLASH application on
> > > > > > > > > > > > our cluster using pvfs-1.6.3.
> > > > > > > > > > > >
> > > > > > > > > > > > We have been experiencing crashes on one of our compute nodes that
> > > > > > > > > > > > appear to be caused by software with direct access to the kernel. The
> > > > > > > > > > > > nodes freeze: you can no longer ping the interface for the Gb
> > > > > > > > > > > > switch, the terminal is blacked out and unresponsive to keyboard input,
> > > > > > > > > > > > and no diagnostic information is written to the log files. The only 2
> > > > > > > > > > > > modules loaded on our compute nodes are gm (for Myrinet's MPICH-gm)
> > > > > > > > > > > > and pvfs, which we are running as a service over the Gb switch.
> > > > > > > > > > > >
> > > > > > > > > > > > We have seen the crash occur with the exact same symptoms when trying to
> > > > > > > > > > > > run our application on a purely Gb based configuration (mpich/hdf5 p4
> > > > > > > > > > > > configuration and pvfs running over Gb based switch) and with the mixed
> > > > > > > > > > > > configuration described above. Actually the pure Gb configuration has
> > > > > > > > > > > > performed much worse and the crashes have always happened much quicker
> > > > > > > > > > > > in terms of the amount of progress in our runs. Some of the Myrinet runs
> > > > > > > > > > > > have run to completion (though this hasn't been consistent when trying
> > > > > > > > > > > > the exact same run), but I never had any success with the Gb based runs,
> > > > > > > > > > > > during which Myrinet did not have any services running and its
> > > > > > > > > > > > module was not loaded.
> > > > > > > > > > > >
> > > > > > > > > > > > I was wondering if you have heard of pvfs causing problems similar to
> > > > > > > > > > > > these. Also if one of the developers would like to have an account on
> > > > > > > > > > > > the machine to investigate this further that would be tremendous. I have
> > > > > > > > > > > > been working with developers at Myrinet for the past week and their
> > > > > > > > > > > > ability to log on has proved beneficial. I'm open to trying pvfs2 if the
> > > > > > > > > > > > kernel module is more robust than pvfs-1.6.3 but would like to at least
> > > > > > > > > > > > get your insights into the current problem we are having with this
> > > > > > > > > > > > installation.
> > > > > > > > > > > >
> > > > > > > > > > > > I'm not sure if this is relevant, but on two of our earliest runs some
> > > > > > > > > > > > corrupted files were produced right before our runs crashed. These files
> > > > > > > > > > > > were written to directories in the pvfs file system.
> > > > > > > > > > > >
> > > > > > > > > > > > I just reinstalled a new kernel this afternoon and rebuilt all the
> > > > > > > > > > > > software, scrapped all old pvfs directories and rebuilt them and my
> > > > > > > > > > > > Myrinet based run ended up dying in the same manner as the others.
> > > > > > > > > > > >
> > > > > > > > > > > > I've also appended some information about the hardware/software
> > > > > > > > > > > > configuration of our machine.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Shawn
> > > > > > > > > > > >
> > > > > > > > > > > > Here's the software/hardware information.
> > > > > > > > > > > >
> > > > > > > > > > > > Hardware. 17 node (1 master/16 compute) cluster from Aspen Systems with
> > > > > > > > > > > > the Myrinet M3-E32 switch and a Gb switch (24 port HP Procurve 2724,
> > > > > > > > > > > > J4897a).
> > > > > > > > > > > > Nodes: e7505 chipset, dual 2.8GHz Xeon processors, 4GB memory. PCIXE-2
> > > > > > > > > > > > dual channel Myrinet cards, cards are in 133MHz PCI slots, Intel Corp.
> > > > > > > > > > > > 82545EM Gigabit Ethernet Controller. Nvidia 5950 Geforce FX Ultra.
> > > > > > > > > > > >
> > > > > > > > > > > > We are running a stock 2.4.27 kernel from kernel.org. The current
> > > > > > > > > > > > running kernel is a 2.4.27smp-64GB-noAGPsupport. We initially installed
> > > > > > > > > > > > RH9 on this system and tried to take as many of the Red Hat features out
> > > > > > > > > > > > as possible, but system parameters are still consistent with RH values.
> > > > > > > > > > > > Here's /proc/version info.
> > > > > > > > > > > >
> > > > > > > > > > > > [root at ellipse0 ~]# cat /proc/version
> > > > > > > > > > > > Linux version 2.4.27smp-64gb-noagp (root at ellipse0) (gcc version 3.2.2
> > > > > > > > > > > > 20030222 (Red Hat Linux 3.2.2-5)) #1 SMP Mon Dec 20 12:14:14 CST 2004
> > > > > > > > > > > >
> > > > > > > > > > > > Myrinet configuration
> > > > > > > > > > > >
> > > > > > > > > > > > gm-2.1.6 (linux-2.4 - ia32)
> > > > > > > > > > > > pvfs-1.6.3 (running over Gb switch)
> > > > > > > > > > > > mpich-1.2.6..13b-gm * ifc 8.1 (Version 8.1 (l_fc_p_8.1.018))
> > > > > > > > > > > > * icc 8.1 (Version 8.1 (l_cc_p_8.1.021))
> > > > > > > > > > > > hdf5-1.6.2 (built against this mpich)
> > > > > > > > > > > >
> > > > > > > > > > > > Gb configuration (all services running over Gb based switch):
> > > > > > > > > > > >
> > > > > > > > > > > > pvfs-1.6.3
> > > > > > > > > > > > mpich-1.2.6-p4
> > > > > > > > > > > > ifc 8.1 (Version 8.1 Build 20040803Z Package ID:
> > > > > > > > > > > > l_fc_p_8.1.018)
> > > > > > > > > > > > gcc 3.2.2 (gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5))
> > > > > > > > > > > > hdf5-1.6.2-p4 (built against mpich-1.2.6-p4)
> > > > > > > > > > > >
> > > > > > > > > > > > I have configured both MPICH libraries with-romio filesupport for the
> > > > > > > > > > > > following file systems (pvfs,ufs,nfs)
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > _______________________________________________
> > > > > > > > > > > > PVFS-users mailing list
> > > > > > > > > > > > PVFS-users at www.beowulf-underground.org
> > > > > > > > > > > > http://www.beowulf-underground.org/mailman/listinfo/pvfs-users