<div dir="ltr">I finally narrowed it down. It turns out we had a problem merging the previous release, but it did not show up since we never got a chance to test it. Sam added an op_release in namei.c to fix a kmem_cache leak, and it sneaked in twice without warning. Taking that out fixed the problem. <br>
<br>Bart.<br><br><br><br><div class="gmail_quote">On Tue, Jul 29, 2008 at 8:03 AM, Phil Carns <span dir="ltr">&lt;<a href="mailto:carns@mcs.anl.gov">carns@mcs.anl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
I'm having a hard time thinking of anything specific that would have impacted this. You could try to narrow it down by taking a diff of just the src/kernel/linux-2.6 directory and applying that to a 2.7.1 tree, then testing to see whether it is something specifically in the kernel module code.<br>
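<br>
For what it's worth, a rough sketch of that approach follows. It assumes the 2.7.1 tree and the head export sit side by side under one parent directory. The function name apply_kmod_diff and the directory names in the usage line are placeholders, not anything from the PVFS tree.<br>
<pre>
```shell
# Sketch only: applies the head version of the kernel module directory
# to an older tree. All names passed in are hypothetical.
apply_kmod_diff() {
    parent=$1   # directory that contains both trees
    old=$2      # name of the 2.7.1 tree under $parent (patched in place)
    head=$3     # name of the head export under $parent
    sub=src/kernel/linux-2.6

    # Run in a subshell so the caller's working directory is untouched.
    (
        cd "$parent" || exit 1
        # diff exits 1 when the trees differ, which is the expected case;
        # only a status of 2 or higher signals real trouble.
        diff -ruN "$old/$sub" "$head/$sub" > kmod.diff
        [ $? -le 1 ] || exit 1
        # -p1 strips the leading tree name so paths resolve inside $old.
        cd "$old" || exit 1
        patch -p1 -i ../kmod.diff
    )
}
```
</pre>
Usage would be something like <code>apply_kmod_diff ~/src pvfs-2.7.1 pvfs2-head</code> (hypothetical names), followed by rebuilding and reloading the kernel module from the patched 2.7.1 tree.<br>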
<br>
-Phil<br>
<br>
Bart Taylor wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">
I ran the test the same way you mentioned - outside of the LTP framework - and still had the problem. I have applied the patch that fixed the rename06 test as well as the kernel buffer overflow fix from a few days ago and still have the problem.<br>
<br>
I did a CVS export of head this morning and used the same configure and build as last time. I ran the openfile test against a file system created from head and against a 2.7.1 file system (with some recent patches), and both tests succeed, so it seems like the fix is somewhere between the 2.7.1 release and head, but I am not sure where. Do you have an idea where it might be lurking?<br>
<br>
Bart.<br>
<br>
<br>
<br></div><div><div></div><div class="Wj3C7c">
On Fri, Jul 25, 2008 at 7:16 AM, Phil Carns &lt;<a href="mailto:carns@mcs.anl.gov" target="_blank">carns@mcs.anl.gov</a>&gt; wrote:<br>
<br>
Phil Carns wrote:<br>
<br>
Bart Taylor wrote:<br>
<br>
I am having a problem with an LTP test from the 20080630 set of LTP tests. The 'openfile01' test does 10 threaded opens of 10 files. It is attached in case you need a copy. The test completes successfully, but an 'ls' command immediately after that hangs and cannot be killed. Eventually the node hangs as well. Any command that touches the file system will trigger the problem.<br>
<br>
We also tried this with the 2.7.1 release tarball and saw the same problem. The setup is a single-node file system running RHEL4 with a 2.6.9-67 kernel. The client was on the same node.<br>
<br>
Here is the configure line used:<br>
<br>
./configure --with-kernel=/lib/modules/`uname -r`/build<br>
<br>
and how the client was started:<br>
<br>
./pvfs2-client -p ./pvfs2-client-core<br>
<br>
The fs.conf file is attached.<br>
<br>
The client debug mask was set to 'all', and /proc/sys/pvfs2/debug had a value of 32767. But once the 'ls' command was issued, there were no log messages.<br>
<br>
Does anyone else see this error?<br>
<br>
Bart.<br>
<br>
<br>
Are you able to reproduce this running openfile by itself after<br>
a fresh boot? It looks like openfile operates on a file in the<br>
current working directory, so I have been trying to run it like<br>
this:<br>
<br>
&lt;mount pvfs2 on /mnt/pvfs2&gt;<br>
cd /mnt/pvfs2<br>
~/openfile -f10 -t10<br>
ls -alh<br>
<br>
So far I haven't had any trouble with that particular<br>
combination. I'm running it on a CentOS 4 box with a very<br>
similar kernel. The openfile test looks fairly innocent: with<br>
those arguments, each of 10 separate threads opens the same single<br>
file 10 times (for a total of 100 file descriptors open to the<br>
same file), if I understand correctly.<br>
<br>
If I try to run a full LTP test, however, I do have other<br>
problems. In particular the rename06 test hangs. I can trigger<br>
that one by itself as follows:<br>
<br>
export TMPDIR=/mnt/pvfs2<br>
~/rename06<br>
<br>
The same suite of tests runs fine on a 2.6.24 kernel and a trunk<br>
build of PVFS. I'm not sure yet if the difference is between<br>
pvfs versions or between kernel versions.<br>
<br>
<br>
The rename06 test passes with pvfs trunk; I think that particular<br>
problem has already been fixed. I still haven't figured out why<br>
openfile01 would be a problem, though.<br>
<br>
-Phil<br>
<br>
<br>
</div></div></blockquote>
<br>
</blockquote></div><br></div>