[Pvfs2-users] PVFS2 Stability

Luke Hindman platypus at techemail.com
Thu Apr 20 13:06:54 EDT 2006


First I'd like to thank you both for your responses.  I would have responded
sooner, but I got pulled off on another project and this is my first chance
to get back to the pvfs stuff.  Ironically, getting the other project to
work seems to have indirectly fixed my PVFS2 stability issues.  This other
project required me to upgrade the kernel on all our nodes from 2.6.11 to
2.6.12.  I was previously using the bcm5700 (version 8.2.18-1) kernel module
for our Broadcom Gigabit ethernet cards, but since I was upgrading the
kernel I decided to switch to the latest tg3 driver (version 3.43f-1) from
the Broadcom site.

Those were the only system changes I made.  I then recompiled the pvfs2
package and tried recreating the issue and wonder of wonders...  The issue
was gone!!!  I reran all my original tests as well as some more intense
testing, but not a single hung process.  Needless to say, I am thrilled with
this result and look forward to putting this new filesystem through its
paces.

Thanks for your comments and suggestions,

--Luke


On 4/11/06, Murali Vilayannur <vilayann at mcs.anl.gov> wrote:
>
> Hi Luke,
> Thanks for the report.
> Could you turn on TROVE_DEBUG (i.e. pvfs2-setdebugmask -m .. "trove, storage")
> on the servers and send that (to the list if it is not too large, or
> privately otherwise)?
> Thanks,
> Murali
>
> On Mon, 10 Apr 2006, Luke Hindman wrote:
>
> > Hi,
> >
> > I've been working with PVFS2 since December, trying to implement it on our
> > 60 node (120 CPU) Linux Beowulf Cluster.  We run several applications that
> > are I/O intensive, and the performance improvements (read and write) over
> > NFS have been phenomenal!  However, I am having a problem with stability.
> > Quite frequently, when I run a job that utilizes the pvfs2 filesystem, some
> > of my processes will hang indefinitely.  I have tried a variety of
> > applications, including writing a simple shell script using the pvfs2-cp
> > command just to copy files around and a simple MPI2 program using the ROMIO
> > interface as well as standard Unix I/O.  In every case I can cause the
> > processes to hang.
> >
> <snipped>
>