[PVFS-users] FW: PVFS Hangups during concurrent read/writes
Brannen S Hough
bshough at impactsci.com
Thu Aug 5 18:21:08 EDT 2004
Yet more information, not sure if this helps or not: I set up
a 2 node cluster. I can run multiple test programs concurrently on the
Manager node, and I can run multiple test programs concurrently on the other
node (that is only an IONode), but my test programs locked up when running
one on each of the nodes - one doing reads and the other writes.
The strange thing about that is they aren't actually hung. After a
long time (couple of hours) one of the two started up again, right where it
left off, and ran at about normal speed to completion. The other struggled
slowly through a couple more files, then stopped again for good, and was
in the same place when I killed it a couple of hours later. This gets
weirder and weirder.
Are people using PVFS more as a file server (where writes are
rare, and most accesses are reads)? Any suggestions are welcome.
- Brannen
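
The test program itself is not shown in this thread, so the following is
only a minimal sketch of the kind of read/write test described above, not
the program that was actually run. It assumes the PVFS volume is mounted
and reachable through ordinary file I/O. The function names, file names,
and block size are all hypothetical. It times each file so that a stall
of the kind reported (one file taking hours) stands out in the results.

```python
import os
import time


def write_files(directory, count, size, block=65536):
    """Write `count` files of `size` bytes; return (path, seconds) per file."""
    timings = []
    payload = b"x" * block
    for i in range(count):
        # Hypothetical naming scheme, chosen only for this sketch.
        path = os.path.join(directory, f"test_{i:04d}.dat")
        start = time.monotonic()
        with open(path, "wb") as f:
            remaining = size
            while remaining > 0:
                chunk = payload[:min(block, remaining)]
                f.write(chunk)
                remaining -= len(chunk)
            # Push the data out of local buffers before stopping the clock.
            f.flush()
            os.fsync(f.fileno())
        timings.append((path, time.monotonic() - start))
    return timings


def read_files(paths, block=65536):
    """Read each file to EOF; return (path, bytes_read, seconds) per file."""
    results = []
    for path in paths:
        start = time.monotonic()
        total = 0
        with open(path, "rb") as f:
            while True:
                data = f.read(block)
                if not data:
                    break
                total += len(data)
        results.append((path, total, time.monotonic() - start))
    return results


def slowest(timings):
    """Return the entry with the largest elapsed time (last tuple field)."""
    return max(timings, key=lambda t: t[-1])
```

To mimic the failing case, one node would call write_files() on a
directory under the mount while the other calls read_files() on files
already present there; slowest() then shows which file each side was
stuck on.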
> -----Original Message-----
> From: Rob Ross [mailto:rross at mcs.anl.gov]
> Sent: Tuesday, August 03, 2004 2:58 PM
> To: Brannen S Hough
> Cc: pvfs-users at beowulf-underground.org
> Subject: Re: [PVFS-users] FW: PVFS Hangups during concurrent read/writes
>
> Hi Brannen,
>
> Thanks for the problem report and for trying the newest prerelease before
> getting back to us!
>
> What exactly is your test program?
>
> Does this only happen when you have the two iods running on the same
> node?
>
> It would be helpful to us for you to recompile with "-g", attach to the
> pvfsd, and get a stack dump. Also some debugging output could help... try
> 0x077 and see if that gets much into the pvfsdlog file as a start.
>
> The iods don't coordinate read and write operations in a way that would
> cause deadlock, so that shouldn't be it. The mgr just serializes
> everything it does, so that should be ok too. I'm not sure what is going
> on quite yet...
>
> Rob
>