[Pvfs2-developers] Error with concurrent opens
Phil Carns
carns at mcs.anl.gov
Fri Aug 1 10:25:19 EDT 2008
Doh- sorry to hear about the merge problem, but I am relieved to know we
don't have another bug floating around on this path! Thanks for the update.
-Phil
Bart Taylor wrote:
> I finally narrowed it down. It turns out we had a problem merging the
> previous release, but it did not show up since we never got a chance to
> test it. Sam added an op_release in namei.c to fix a kmem_cache leak,
> and it sneaked in twice without warning. Taking that out fixed the problem.
>
> Bart.
>
>
>
> On Tue, Jul 29, 2008 at 8:03 AM, Phil Carns <carns at mcs.anl.gov> wrote:
>
> I'm having a hard time thinking of anything specific that would have
> impacted this. You could maybe try to narrow it down some by taking
> a diff of just the src/kernel/linux-2.6 directory and apply that to
> a 2.7.1 tree to test and see if it is something specifically in the
> kernel module code.
>
> -Phil
>
> Bart Taylor wrote:
>
> I ran the test the same way you mentioned - outside of the LTP
> framework - and still had the problem. I have applied the patch
> that fixed the rename06 test as well as the kernel buffer
> overflow fix from a few days ago and still have the problem.
>
> I did a CVS export of head this morning and used the same
> configure and build as last time. I ran the open file test
> against a file system created from head and against a 271 file
> system (with some recent patches) and both tests succeed, so it
> seems like the fix is somewhere between the 271 release and
> head, but I am not sure where. Do you have an idea where it
> might be lurking?
>
> Bart.
>
>
>
> On Fri, Jul 25, 2008 at 7:16 AM, Phil Carns <carns at mcs.anl.gov> wrote:
>
> Phil Carns wrote:
>
> Bart Taylor wrote:
>
> I am having a problem with an LTP test from the 20080630 set of LTP
> tests. The 'openfile01' test does 10 threaded opens of 10 files. It is
> attached in case you need a copy. The test completes successfully, but
> an 'ls' command immediately after that hangs and cannot be killed.
> Eventually the node hangs as well. Any command that touches the file
> system will trigger the problem.
>
> We also tried this with the 2.7.1 release tarball and see the same
> problem. The setup is a single-node file system running RHEL4 with a
> 2.6.9-67 kernel. The client was on the same node.
>
> Here is the configure line used:
>
> ./configure --with-kernel=/lib/modules/`uname -r`/build
>
> and how the client was started:
>
> ./pvfs2-client -p ./pvfs2-client-core
>
> The fs.conf file is attached.
>
> The client debug mask was set to 'all', and /proc/sys/pvfs2/debug had a
> value of 32767. But once the 'ls' command was issued, there were no log
> messages.
>
> Does anyone else see this error?
>
> Bart.
>
>
> Are you able to reproduce this running openfile by itself after a fresh
> boot? It looks like openfile operates on a file in the current working
> directory, so I have been trying to run it like this:
>
> <mount pvfs2 on /mnt/pvfs2>
> cd /mnt/pvfs2
> ~/openfile -f10 -t10
> ls -alh
>
> So far I haven't had any trouble with that particular combination. I'm
> running it on a CentOS 4 box with a very similar kernel. The openfile
> test looks fairly innocent: with those arguments, each of 10 separate
> threads opens the same single file 10 times (for a total of 100 file
> descriptors open to the same file), if I understand correctly.
>
> If I try to run a full LTP test, however, I do have other problems. In
> particular the rename06 test hangs. I can trigger that one by itself as
> follows:
>
> export TMPDIR=/mnt/pvfs2
> ~/rename06
>
> The same suite of tests runs fine on a 2.6.24 kernel and a trunk build
> of PVFS. I'm not sure yet if the difference is between pvfs versions or
> between kernel versions.
>
>
> The rename06 test passes with pvfs trunk; I think that particular
> problem has already been fixed. I still haven't figured out why
> openfile01 would be a problem, though.
>
> -Phil
>
>
>
>
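The suggestion in the Jul 29 message, to diff only the src/kernel/linux-2.6 directory and apply that diff to a 2.7.1 tree, could be carried out roughly as sketched below. This is a minimal illustration and not something posted in the thread: the tree names `release` and `newer`, the stand-in file contents, and the temporary working directory are all placeholders so the script can run anywhere. With real source trees, only the `diff` and `patch` lines would be needed, pointed at the actual directories.

```shell
#!/bin/sh
# Sketch only: "release" and "newer" are hypothetical stand-ins for a
# 2.7.1 tree and a second tree to compare against it.
set -eu

work=$(mktemp -d)
cd "$work"

# Build two tiny fake trees that differ inside and outside the kernel dir.
mkdir -p release/src/kernel/linux-2.6 newer/src/kernel/linux-2.6
printf 'old kernel code\n' > release/src/kernel/linux-2.6/namei.c
printf 'new kernel code\n' > newer/src/kernel/linux-2.6/namei.c
printf 'unrelated change\n' > newer/README

# Diff only the kernel module directory. diff exits with status 1 when
# the inputs differ, so accept that status while "set -e" is active.
diff -urN release/src/kernel/linux-2.6 newer/src/kernel/linux-2.6 \
    > kernel.diff || [ $? -eq 1 ]

# Apply just those changes to the release tree. -p1 strips the leading
# "release/" or "newer/" path component recorded in the diff headers.
(cd release && patch -p1 < ../kernel.diff)

# The kernel file now matches the newer tree; README was never touched.
result=$(cat release/src/kernel/linux-2.6/namei.c)
echo "$result"
```

If the hang follows the kernel-only diff onto the 2.7.1 tree, the cause is in the kernel module code; if not, it lies elsewhere in the tree.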