[Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure
Mark Van De Vyver
mvyver at gmail.com
Sat Mar 3 18:26:19 EST 2007
Hi Steve,
Thanks for your response. What you simulated looks pretty close to
what is going on here, so you are right: it could be something else...
network, memory, CPUs, HDD..... :(
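
One way to narrow this down might be to re-read a single file from the
PVFS2 area several times and see whether its checksum stays the same,
since I once saw `cmp` pass twice and then fail on the same pair of files.
The sketch below is untested on the cluster and assumes `md5sum` is
installed; `check_stable_reads` is just a name I made up, not a PVFS2 tool.

```shell
#!/bin/sh
# Sketch only: re-read one file several times and report whether every
# read yields the same checksum. check_stable_reads is a hypothetical
# helper name, not part of PVFS2.
check_stable_reads() {
    file=$1
    reads=${2:-5}          # number of reads, default 5
    first=""
    i=1
    while [ "$i" -le "$reads" ]; do
        # md5sum prints "<hash>  <name>"; keep only the hash
        sum=$(md5sum "$file" | cut -d' ' -f1)
        if [ -z "$sum" ]; then
            echo "could not read $file" >&2
            return 2
        fi
        echo "read $i: $sum"
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            echo "checksum changed between reads of $file"
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}
```

For example, `check_stable_reads /mnt/pvfs2/file1 10` while the other
nodes are busy copying. Identical checksums would not prove the store is
fine (the reads may be served from a cache), but a changing checksum on an
unchanged file would point at the read path rather than at the original copy.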
Appreciate your efforts
Regards
Mark
On 3/3/07, Steve <steve at bov.nu> wrote:
> Hi,
>
> Yes, last night I was repeatedly copying and comparing a 300 MB file on one
> client, while the other client/Samba box was serving files to a Windows XP
> machine, which was copying them back to another directory on the share.
> Although my I/O box runs 2 pvfs2 services in one, this should simulate 2
> separate servers pretty well.
>
> We have been using the pvfs2 store for several weeks. While the usage isn't
> heavy, we have on many occasions had simultaneous reads and writes, and
> sometimes simultaneous writes. It's very low-end hardware, and files range
> from 6 MB to 4 GB currently, with 570 GB in a 1 TB store. I've seen no
> corruption.
>
> You seem to have something strange going on.
>
> Steve
>
> -------Original Message-------
>
> From: Mark Van De Vyver
> Date: 03/03/2007 02:28:05
> To: Steve
> Cc: pvfs2-users at beowulf-underground.org
> Subject: Re: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure
>
> Hi Steve
>
> > I will try the script. In the meantime I have tried to hammer it this
> > afternoon, copying a 500 MB ISO and repeatedly doing cmp; I saw no errors.
> > Also, it came to me that when I copied the 500 GB in, I used a mirroring
> > application which would have highlighted any bad copies as files updating.
> > I saw none.
>
> I can also see no errors if I just have one machine copying/verifying
> to the PVFS2 area. Is your error-free run from a case when several
> machines are accessing/writing to the one PVFS2 area?
>
> Regards
> Mark
>
> > -------Original Message-------
> >
> > From: Mark Van De Vyver
> > Date: 02/03/2007 19:17:40
> > To: Steve
> > Cc: pvfs2-users at beowulf-underground.org
> > Subject: Re: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure
> >
> > Hi Steve,
> >
> > I don't have access to the cluster now, but the following script has a
> > few fixes.
> > I haven't yet tested copying from a non-pvfs area to pvfs with pvfs 2.6.2.
> > I saw something similar in pvfs 1.5.1 when copying from a tmpfs area to
> > pvfs2.
> >
> > Running `mount` should show you if the DVD is auto-mounted and under
> > what directory, in which case my mount below is redundant and you'll
> > need to replace the '/media/cdrom/' references.
> >
> > # untested script start
> > mkdir -p /media/cdrom
> > # you may have to insert your system's dev name here
> > mount /dev/hdb /media/cdrom
> >
> > # strip the mount point so ${fn} is just the file name
> > for fn in `ls /media/cdrom/*.* | sed -e 's/\/media\/cdrom\///g'`
> > do
> >   if [ -f "/mnt/pvfs2/${fn}" ]
> >   then
> >     # This should 'fail' more frequently than the cmp in the else clause
> >     cmp "/media/cdrom/${fn}" "/mnt/pvfs2/${fn}"
> >     if [ $? != 0 ]
> >     then
> >       echo "Preexisting copy not exact - more frequent and random?"
> >     fi
> >   else
> >     cp "/media/cdrom/${fn}" "/mnt/pvfs2/${fn}"
> >     cmp "/media/cdrom/${fn}" "/mnt/pvfs2/${fn}"
> >     if [ $? != 0 ]
> >     then
> >       echo " Initial copy not exact - less frequent and random"
> >     fi
> >   fi
> > done
> > # untested script end
> >
> > Thanks
> > Mark
> >
> > On 3/2/07, Steve <steve at bov.nu> wrote:
> > > Well, I thought I'd try a manual cp.
> > >
> > > I never mounted a DVD under Linux, only CD-ROM. I mounted a movie DVD and
> > > get an I/O error when trying to copy. I mounted a data DVD burned under
> > > Windows and the mount fails as wrong filesystem.
> > >
> > > What's your mount command syntax?
> > >
> > > BTW, do you get the same if you copy your files to a local non-pvfs2 disk
> > > and then use your script?
> > >
> > > -------Original Message-------
> > >
> > > From: Mark Van De Vyver
> > > Date: 02/03/2007 09:40:30
> > > To: Steve
> > > Cc: pvfs2-users at beowulf-underground.org
> > > Subject: Re: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure
> > >
> > > Thanks Steve,
> > >
> > > I don't see any problem until I run the diff or cmp, and even then
> > > these indicate the files are identical if the cmp is run _immediately_
> > > after the file copy.
> > > cmp and diff only indicate a difference when a file is 'checked' after
> > > some other files have been copied-checked.
> > >
> > > The files are from the NYSE Trade and Quote (TAQ) DVDs, so they are
> > > text stored as binary.
> > >
> > > You might be able to try the following with a dozen or so large binary
> > > files; I have approx 300-400GB stored in the PVFS area.
> > >
> > > Ideally the following should be run on two or more PVFS2 servers at
> > > the same time. Apply this to several DVDs that have not been copied
> > > to the PVFS area, then reapply the script to the same DVDs after they
> > > have been copied.
> > > The following is a slightly simplified version of my script - here I
> > > don't delete and re-copy when an existing file fails the cmp
> > > verification:
> > >
> > > # untested script start
> > > # strip the mount point so ${fn} is just the file name
> > > for fn in `ls /dvd/*large.bin | sed -e 's/\/dvd\///g'`
> > > do
> > >   if [ -f "/mnt/pvfs2/${fn}" ]
> > >   then
> > >     # This should 'fail' more frequently than the cmp in the else clause
> > >     cmp "/dvd/${fn}" "/mnt/pvfs2/${fn}"
> > >     if [ $? != 0 ]
> > >     then
> > >       echo "Preexisting copy not exact - more frequent and random?"
> > >     fi
> > >   else
> > >     cp "/dvd/${fn}" "/mnt/pvfs2/${fn}"
> > >     cmp "/dvd/${fn}" "/mnt/pvfs2/${fn}"
> > >     if [ $? != 0 ]
> > >     then
> > >       echo " Initial copy not exact - less frequent and random"
> > >     fi
> > >   fi
> > > done
> > > # untested script end
> > >
> > > Regards
> > > Mark
> > >
> > > On 3/2/07, Steve <steve at bov.nu> wrote:
> > > > My setup is a little different in that at the moment I have 2 I/O
> > > > services running on one box, a metadata server on another and a
> > > > client/Samba server on a third. I have moved in the data via Samba. We
> > > > have copied in MP3s and AVI/MPGs as well as large ISOs plus software
> > > > EXEs. Surely after several weeks of use we would notice some problem?
> > > >
> > > > I do have another box set up as a client that happens to have a DVD-ROM
> > > > drive in it.
> > > >
> > > > What type of files? A VOB?
> > > >
> > > > What sequence of commands would I need to run to test your problem?
> > > >
> > > > If I get a little spare time I could try for you.
> > > >
> > > > Steve
> > > >
> > > > -------Original Message-------
> > > >
> > > > From: Mark Van De Vyver
> > > > Date: 02/03/2007 08:18:11
> > > > To: Steve
> > > > Subject: Re: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure
> > > >
> > > > Hi Steve,
> > > >
> > > > > Not sure if this helps any, but I have copied over 500 GB of media
> > > > > files to pvfs2 running on old Dells (533 to 866 CPU) with very little
> > > > > RAM, running on caos3 beta 3. Although I haven't done any checks other
> > > > > than using the media, I haven't noticed any problems.
> > > >
> > > > The failures might be spurious....?
> > > >
> > > > > Could you have problems with the DVD device?
> > > >
> > > > I doubt it - but it may not be impossible?
> > > > This happens with the DVD drives on all three nodes, and when I just
> > > > have one node 'working' the diff/cmp failures either don't occur or
> > > > occur very, very rarely. Start all three nodes 'working' and I see roughly
> > > > 1 out of 2 binary files fail the initial diff/cmp check, but very, very
> > > > few (one every couple of DVDs) fail the cmp/diff check immediately
> > > > after the copy is done.....
> > > >
> > > > Thanks
> > > > Mark
> > > >
> > > > >
> > > > > -------Original Message-------
> > > > >
> > > > > From: Mark Van De Vyver
> > > > > Date: 02/03/2007 03:26:40
> > > > > To: pvfs2-users at beowulf-underground.org
> > > > > Subject: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure
> > > > >
> > > > > Hi,
> > > > >
> > > > > This is a follow-up on an earlier email where I reported that PVFS
> > > > > 1.5.1 failed to copy binary files from several DVDs.
> > > > >
> > > > > I'm running a 3-node Rocks 4.2.1 cluster, CentOS 4.4, x86_64; nodes are
> > > > > connected via an unmanaged switch.
> > > > >
> > > > > I have reinstalled the Rocks cluster (all nodes), including the PVFS2
> > > > > Roll.
> > > > > The cluster is set up with the frontend as the metadata server, and the
> > > > > other two nodes are PVFS2 I/O servers and clients. The /mnt/pvfs2
> > > > > area is on a 3-disk RAID 0 partition formatted as ext3.
> > > > > After installing I ran the test steps in the "PVFS2 Quick Start
> > > > > Guide". The test steps ran without error.
> > > > > I upgraded to PVFS 2.6.2 on all nodes and re-ran the test steps, again
> > > > > with no errors or problems.
> > > > >
> > > > > I built PVFS 2.6.2 with the following:
> > > > >
> > > > > ./configure --with-kernel=</path/to/kernel26/> \
> > > > >   --enable-kernel-sendfile --prefix=/usr/local/pvfs2/
> > > > >
> > > > > Then type
> > > > >
> > > > > make all
> > > > > make kmod_install
> > > > > make install
> > > > >
> > > > > On each node I have a script that lists the files on the DVD disc
> > > > > loaded on that node.
> > > > > Each file is copied if it does not exist on the HDD (PVFS area) and
> > > > > the copy is immediately verified:
> > > > >
> > > > > cp /dvd/file1 /mnt/pvfs2/file1
> > > > > cmp /dvd/file1 /mnt/pvfs2/file1
> > > > >
> > > > > `cmp` does not report any error.
> > > > > This has been done for 60-70 DVDs.
> > > > >
> > > > > If I insert a DVD that has previously been copied, my script finds that
> > > > > a file exists in the PVFS area and does a `cmp` with the DVD file. If
> > > > > the file fails this comparison, the file is deleted, copied, and
> > > > > verified (cmp).
> > > > >
> > > > > I notice that frequently and randomly the previously copied files will
> > > > > fail the _initial_ `cmp` check if more than one node is 'active', i.e.
> > > > > processing a DVD.
> > > > > Once deleted and copied, the second `cmp` check is passed.
> > > > >
> > > > > Some details:
> > > > > The files do not fail the `cmp` check immediately after being copied -
> > > > > only when checking a previously copied file.
> > > > > The `cmp` result indicates a different byte at which the files differ.
> > > > > Re-inserting the same DVD several times results in different files
> > > > > failing the first `cmp` check.
> > > > > The second check (immediately after the copy is finished) is always
> > > > > passed.
> > > > > This occurs rarely, if at all (i.e. I haven't noticed it), when only
> > > > > one node is processing a DVD.
> > > > > This only occurs with binary files, which are relatively large
> > > > > (200 MB - 2 GB).
> > > > > This never occurs with text files, which are also small (100s of KB).
> > > > > The pvfs2-client.log file is empty on each node.
> > > > > I have tried using diff and experience the same results.
> > > > >
> > > > > This is similar to an error I was seeing in PVFS 1.5.1 - hence the
> > > > > upgrade. I've also changed my previous script, which `dd` copied the
> > > > > DVD to memory (approx 8GB), then wrote this ISO file to the PVFS2 area
> > > > > - this worked fine for initial copies, but failed for re-copies. At
> > > > > that time I wasn't verifying the copy, so it was the copy to the
> > > > > PVFS2 area that failed.....
> > > > >
> > > > > Finally, on one occasion when manually running `cmp` on a file I
> > > > > noticed the following sequence.
> > > > >
> > > > > cmp file1 file2 (pass)
> > > > > cmp file1 file2 (pass)
> > > > > diff file1 file2 (fail)
> > > > > cmp file1 file2 (fail)
> > > > >
> > > > > Is this known behavior with a known workaround/configuration setting?
> > > > > The behavior I see made me guess a caching or network issue (there are
> > > > > no other machines on the cluster network).
> > > > > Can anyone suggest PVFS configuration settings that will make PVFS more
> > > > > robust?
> > > > >
> > > > > I'm not a programmer or Linux guru - I just spent this summer
> > > > > converting from WinXP...
> > > > > I'm happy to explore some possible fixes, but don't assume too much :)
> > > > >
> > > > > Thanks in advance
> > > > > Mark
> > > > >