[Pvfs2-users] pvfs2 stability

Andrea Carotti and.carotti at farmchim.uniba.it
Mon May 22 11:56:39 EDT 2006


Hi Mr. Murali,
Of course. Here are my files:

cat /home/Application/pvfs/conf/pvfs2-fs.conf
<Defaults>
        UnexpectedRequests 50
        LogFile /tmp/pvfs2-server.log
        EventLogging none
        LogStamp usec
        BMIModules bmi_tcp
        FlowModules flowproto_multiqueue
        PerfUpdateInterval 1000
        ServerJobBMITimeoutSecs 30
        ServerJobFlowTimeoutSecs 30
        ClientJobBMITimeoutSecs 300
        ClientJobFlowTimeoutSecs 300
        ClientRetryLimit 5
        ClientRetryDelayMilliSecs 2000
</Defaults>

<Aliases>
        Alias dom1 tcp://dom1:3334
        Alias dom2 tcp://dom2:3334
        Alias dom3 tcp://dom3:3334
        Alias dom4 tcp://dom4:3334
        Alias om1 tcp://om1:3334
        Alias om2 tcp://om2:3334
        Alias om3 tcp://om3:3334
        Alias om4 tcp://om4:3334
        Alias om5 tcp://om5:3334
</Aliases>

<Filesystem>
        Name pvfs2-fs
        ID 1869706856
        RootHandle 1048576
        <MetaHandleRanges>
                Range om1 4-429496732
        </MetaHandleRanges>
        <DataHandleRanges>
                Range dom1 429496733-858993461
                Range dom2 858993462-1288490190
                Range dom3 1288490191-1717986919
                Range dom4 1717986920-2147483648
                Range om1 2147483649-2576980377
                Range om2 2576980378-3006477106
                Range om3 3006477107-3435973835
                Range om4 3435973836-3865470564
                Range om5 3865470565-4294967293
        </DataHandleRanges>
        <StorageHints>
                TroveSyncMeta yes
                TroveSyncData no
                AttrCacheKeywords datafile_handles,metafile_dist
                AttrCacheKeywords dir_ent, symlink_target
                AttrCacheSize 4093
                AttrCacheMaxNumElems 32768
        </StorageHints>
</Filesystem>
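
Since the handle ranges above are easy to mistype, here is a minimal sketch that checks them for overlaps and gaps. The numbers are copied from the <MetaHandleRanges> and <DataHandleRanges> sections; the helper names (find_overlaps, find_gaps) are hypothetical and are not part of PVFS2.

```python
# Handle ranges copied from the fs.conf above: (label, first, last), inclusive.
RANGES = [
    ("meta om1", 4, 429496732),
    ("data dom1", 429496733, 858993461),
    ("data dom2", 858993462, 1288490190),
    ("data dom3", 1288490191, 1717986919),
    ("data dom4", 1717986920, 2147483648),
    ("data om1", 2147483649, 2576980377),
    ("data om2", 2576980378, 3006477106),
    ("data om3", 3006477107, 3435973835),
    ("data om4", 3435973836, 3865470564),
    ("data om5", 3865470565, 4294967293),
]
ROOT_HANDLE = 1048576  # RootHandle from the <Filesystem> section


def find_overlaps(ranges):
    """Return (label, label) pairs where a range starts inside an earlier one."""
    ordered = sorted(ranges, key=lambda r: r[1])
    overlaps = []
    widest_label, _, widest_end = ordered[0]
    for label, start, end in ordered[1:]:
        if start <= widest_end:
            overlaps.append((widest_label, label))
        if end > widest_end:
            widest_label, widest_end = label, end
    return overlaps


def find_gaps(ranges):
    """Return (first, last) spans of handles not covered by any range."""
    ordered = sorted(ranges, key=lambda r: r[1])
    gaps = []
    covered_end = ordered[0][2]
    for _, start, end in ordered[1:]:
        if start > covered_end + 1:
            gaps.append((covered_end + 1, start - 1))
        covered_end = max(covered_end, end)
    return gaps


if __name__ == "__main__":
    print("overlaps:", find_overlaps(RANGES))
    print("gaps:", find_gaps(RANGES))
    meta = RANGES[0]
    print("root handle inside meta range:", meta[1] <= ROOT_HANDLE <= meta[2])
```

For this configuration the ranges are contiguous and the root handle falls inside the om1 metadata range, so the handle layout itself does not look like the source of the problem.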

Om1 is the server/client hostname
cat /home/Application/pvfs/conf/pvfs2-server.conf-om1
StorageSpace /pvfs2-storage-space
HostID "tcp://om1:3334"

Om2 is a client hostname
cat /home/Application/pvfs/conf/pvfs2-server.conf-om2
StorageSpace /pvfs2-storage-space
HostID "tcp://om2:3334"


Let me know if you need more information.
Thanks
Andrea

----- Original Message ----- 
From: "Murali Vilayannur" <vilayann at mcs.anl.gov>
To: "Andrea Carotti" <and.carotti at farmchim.uniba.it>
Cc: <pvfs2-users at beowulf-underground.org>
Sent: Monday, May 22, 2006 5:45 PM
Subject: Re: [Pvfs2-users] pvfs2 stability


> Hi Andrea,
> It does look a bit strange to see these messages and yet have the FS
> working..
> Could you post your fs.conf and server.conf files?
> thanks,
> Murali
>
> On Mon, 22 May 2006, Andrea Carotti wrote:
>
>> Hi all,
>> I'm new to this list and to PVFS2. I'm using it on our home-made
>> cluster (9 nodes) running an openMosix kernel 2.4.22-3 and Fedora Core 2.
>> I've installed it with one node running as metadata server, PVFS2 server
>> and data server, and all the others as data servers.
>> I've also compiled and installed the kernel module.
>> This is my current configuration:
>> 1) On all nodes I have an entry in /etc/fstab like this:
>> tcp://om1:3334/pvfs2-fs /mnt/pvfs2 pvfs2 default,noauto 0 0
>> 2) I've added these lines to rc.local:
>> insmod /lib/modules/2.4.22-oM3src/kernel/fs/pvfs2/pvfs2.o
>> /home/Application/pvfs/sbin/pvfs2-client -p /home/Application/pvfs/sbin/pvfs2-client-core
>> mount -t pvfs2 tcp://om1:3334/pvfs2-fs /mnt/pvfs2
>> 3) I've enabled the default service at startup on all the nodes:
>> /etc/init.d/pvfs2-server
>>
>> I'm encountering some problems using it:
>> if I start the server (/etc/init.d/pvfs2-server start) everything seems
>> OK, but on the server /tmp/pvfs2-client.log appears with these errors:
>>
>> [E 16:57:50.651742] msgpair failed, will retry:: Broken pipe
>> [E 16:57:52.691656] msgpair failed, will retry:: Connection refused
>> [E 16:57:54.731666] msgpair failed, will retry:: Connection refused
>> [E 16:57:56.771657] msgpair failed, will retry:: Connection refused
>> [E 16:57:58.811658] msgpair failed, will retry:: Connection refused
>> [E 16:58:00.851658] msgpair failed, will retry:: Connection refused
>> [E 16:58:00.851731] *** msgpairarray_completion_fn: msgpair to server tcp://om1:3334 failed: Connection refused
>> [E 16:58:00.851750] *** Out of retries.
>> [E 16:58:00.851769] getattr_object_getattr_failure : Connection refused
>>
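
As a quick cross-check of the client log above against the ClientRetryLimit (5) and ClientRetryDelayMilliSecs (2000) values in fs.conf, the following sketch computes the spacing between the "msgpair failed, will retry" messages. The timestamps are copied from the log; how the client counts its attempts internally is an assumption here and is not confirmed by the log.

```python
from datetime import datetime

# Values from the <Defaults> section of fs.conf.
CLIENT_RETRY_LIMIT = 5
CLIENT_RETRY_DELAY_MS = 2000

# Timestamps of the six "msgpair failed, will retry" lines in the client log.
STAMPS = [
    "16:57:50.651742",
    "16:57:52.691656",
    "16:57:54.731666",
    "16:57:56.771657",
    "16:57:58.811658",
    "16:58:00.851658",
]

times = [datetime.strptime(s, "%H:%M:%S.%f") for s in STAMPS]

# Milliseconds between consecutive failure messages.
gaps_ms = [(b - a).total_seconds() * 1000 for a, b in zip(times, times[1:])]

# Failures logged after the first one, i.e. the apparent number of retries.
retries_seen = len(STAMPS) - 1

if __name__ == "__main__":
    for gap in gaps_ms:
        print(f"gap: {gap:.1f} ms (configured delay {CLIENT_RETRY_DELAY_MS} ms)")
    print(f"retries seen: {retries_seen} (configured limit {CLIENT_RETRY_LIMIT})")
```

Each gap comes out at roughly 2040 ms and there are five messages after the first failure, which appears consistent with the configured delay and retry limit. So the client seems to be giving up as configured rather than misbehaving; the open question is why om1:3334 refuses the connection at that moment.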
>> However, it seems to work: I can write to /mnt/pvfs2, make directories,
>> and so on with the normal commands (cp, mkdir, etc.).
>>
>> But during the day something goes wrong: the next day I can never see
>> /mnt/pvfs2 without restarting the server, and in /var/log/messages
>> I see:
>> May 18 23:21:20 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 18 23:27:20 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 19 01:06:07 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> May 19 04:08:26 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 19 04:15:40 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> May 19 23:20:48 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 19 23:26:48 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 20 01:06:04 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> May 20 04:08:25 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 20 04:15:34 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> May 20 23:21:09 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 20 23:27:09 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 21 01:06:05 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> May 21 04:08:31 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 21 04:15:41 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> May 21 23:24:05 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 22 01:06:03 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>> May 22 04:08:33 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
>> May 22 04:15:41 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
>>
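
To make the daily pattern in these kernel messages easier to see, here is a small sketch that groups the timeouts by time of day and by the function that reported them. It assumes each entry sits on a single line, as it would in /var/log/messages; the function name group_by_slot and the 10-minute slot width are arbitrary choices made for illustration, and SAMPLE holds only a few of the entries above.

```python
import re
from collections import defaultdict

# A few entries from the excerpt above, one per line.
SAMPLE = """\
May 19 01:06:07 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
May 19 04:08:26 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
May 19 04:15:40 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
May 20 01:06:04 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
May 20 04:08:25 om1 kernel: pvfs2: pvfs2_statfs -- wait timed out and retries exhausted. aborting attempt.
May 20 04:15:34 om1 kernel: pvfs2: pvfs2_inode_getattr -- wait timed out and retries exhausted. aborting attempt.
"""

# Captures: date, hour, minute, and the pvfs2 function that timed out.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+) (\d\d):(\d\d):\d\d \S+ kernel: pvfs2: (\w+) -- wait timed out"
)


def group_by_slot(text, slot_minutes=10):
    """Map (time-of-day slot, function) to the list of dates it occurred on."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a pvfs2 timeout line
        day, hour, minute, func = match.groups()
        slot_start = int(minute) // slot_minutes * slot_minutes
        groups[(f"{hour}:{slot_start:02d}", func)].append(day)
    return dict(groups)


if __name__ == "__main__":
    for (slot, func), days in sorted(group_by_slot(SAMPLE).items()):
        print(f"{slot}  {func:<22} {', '.join(days)}")
```

Run against the full log, this should show the same functions timing out in the same slots every day, a pattern that could point at something scheduled on om1, although the logs alone do not confirm that.
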
>> The same errors appear at the same times each day.
>> Sorry for the long message. I hope someone can help.
>> Thanks
>> Andrea


