[Pvfs2-users] PVFS2 Stability

Julian Martin Kunkel Julian.Kunkel at gmx.de
Mon Apr 10 17:09:16 EDT 2006


Hi,
> I've been working with PVFS2 since December, trying to implement it on our
> 60 node (120 CPU) Linux Beowulf Cluster.  
Your error report is quite interesting and reminds me of benchmarks I did a 
while ago on our test cluster with 5 nodes and 10 CPUs (dual Xeon 2 GHz, 
Gigabit Ethernet), which seems to be a similar configuration.

I did not use the kernel module. The servers already hung during a single 
large I/O operation similar to pvfs2-cp, writing just from the memory of one 
(!) node to the servers. I was able to reproducibly hang a server with one big 
I/O operation. However, I ran out of time and, for some other reasons as 
well, did not finish debugging the pvfs2-server...

At that time I thought the reason for the server hang was our system 
configuration, especially the kernel (2.6.8), which had some problems with our 
Intel IDE chipset, or the dual-CPU configuration (we did not use 
hyperthreading). 
I dug a bit, and it seemed that on a random server the thread responsible 
for dbpf hung (maybe a deadlock or problems with the asynchronous I/O, 
aio).
In my bachelor thesis I replaced the trove module with an I/O stub and testing 
module, which did not hang with the same operations, so I suspect the problem 
must be somewhere in the trove layer.

Maybe it would be good to try pvfs2 without threads to rule out deadlocks or 
problems with aio. In the past it was possible to compile the server with a 
single thread, but I'm not sure how the server has to be compiled nowadays, 
or even if that still works. I only remember that there are the following 
servercflags in the Makefile which have to be removed/replaced 
(?):
__PVFS2_TROVE_THREADED__ 
__PVFS2_JOB_THREADED__
I would be glad to hear if this is still possible and how :)
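
If someone wants to try this, here is a minimal sketch of stripping the two 
defines from a compiler-flags line. It is only an illustration: the helper 
strip_thread_defines, the variable name SERVERCFLAGS and the sample line are 
made up, and I have not checked how the flags actually appear in the current 
Makefile or whether removing them is enough for a single-threaded build.

```python
import re

# The two defines named above. Whether removing them still yields a working
# single-threaded server is an open question, not something this confirms.
THREAD_DEFINES = ("__PVFS2_TROVE_THREADED__", "__PVFS2_JOB_THREADED__")


def strip_thread_defines(line, defines=THREAD_DEFINES):
    """Return `line` with every -D<define> token for the given names removed."""
    for name in defines:
        # Remove the whole token (with an optional =value) plus the whitespace
        # in front of it; the lookarounds keep longer names such as
        # -D<define>_EXTRA untouched.
        pattern = r"\s*(?<!\S)-D" + re.escape(name) + r"(=\S*)?(?!\S)"
        line = re.sub(pattern, "", line)
    return line


# Hypothetical Makefile line, invented for illustration only.
sample = ("SERVERCFLAGS = -O2 -D__PVFS2_TROVE_THREADED__ "
          "-D__PVFS2_JOB_THREADED__ -DSOME_OTHER_FLAG")

print(strip_thread_defines(sample))
```

One would run something like this over a copy of the generated Makefile and 
rebuild the server; unrelated flags on the line are left as they are.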

Also the configure flag --disable-kernel-aio might help.

I'm currently busy and our cluster is being reinstalled with a new 
configuration, but maybe I can try to figure out what is happening at the end 
of the week.

Best regards,
Julian



