[PVFS2-developers] pvfs2 failover almost there
Robert Latham
robl at mcs.anl.gov
Thu Jun 17 16:19:06 EDT 2004
Hey guys
I've been playing around with pvfs2 high availability lately, and I've
almost got it working really well. Active-passive already seems to
work, but active-active has some issues.
This is with pvfs2-0.5.1, tcp, AIO callbacks, Debian unstable.
I've got two servers, both acting as metadata and I/O nodes. heartbeat
fires up the pvfs2-servers on both nodes, and a client (a 3rd node)
runs 'pvfs2-cp testfile /pvfs-ha/testfile' (a 1 GB file -- something
that will take significant time to run).
When one node goes down, the client (pvfs2-cp) reports:
Error: bmi_tcp: Connection reset by peer
Warning: BMI attempting reconnect.
Error: bmi_tcp: Connection refused
Error: poorly formatted protocol message received.
Protocol version mismatch: received version 0 when expecting version 501.
Please verify your PVFS2 installation and make sure that the version is
consistent.
msgpairarray decode error: Protocol not supported
PVFS_sys_write: Protocol not supported
Error: short write
Or sometimes I get this:
Warning: BMI attempting reconnect.
Error: bmi_tcp: Connection refused
Error: poorly formatted protocol message received.
Too small: message only 0 bytes.
msgpairarray decode error: Protocol error
PVFS_sys_write: Protocol error
Error: short write
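Both traces show the reconnect attempt hitting "Connection refused", so it may help to measure how long the server port stays unreachable during a failover. Here is a minimal sketch of a probe that could be run from the client node. The function name wait_for_port is made up for illustration and is not part of pvfs2; it only polls a TCP port and reports how long it took to become reachable.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll host:port until a TCP connect succeeds or the timeout expires.

    Returns the seconds elapsed before the first successful connect,
    or None if the port never accepted a connection in time.
    Hypothetical helper for illustration; not part of pvfs2.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        try:
            # create_connection raises OSError on refusal or timeout
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            # Port not accepting connections yet; wait and retry
            time.sleep(interval)
    return None
```

Calling wait_for_port(host, port) right after taking a node down would give a rough figure for the failover window, where host is whatever address heartbeat moves between the nodes and port is the one listed in the pvfs2 server config (3334 is assumed to be the usual default, but check your config).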
Any suggestions?
==rob
--
Rob Latham
Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA B29D F333 664A 4280 315B