[Pvfs2-users] Problem with restarting pvfs on a cluster

Giammanco Raimondo giamma at vki.ac.be
Tue Oct 9 10:40:40 EDT 2007


Hello Mr. Ross,

Thanks for your prompt reply.

I believe the config file you mention is, in my case,
/etc/pvfs2-server.conf-master-pvfs.
Its contents are:
############################
StorageSpace /pvfs2-storage-space
HostID "tcp://master-pvfs:3334"
LogFile /tmp/pvfs2-server.log
############################

The config file for a node, /etc/pvfs2-server.conf-node1-pvfs for 
example, is the following:
############################
StorageSpace /pvfs2-storage-space
HostID "tcp://node1-pvfs:3334"
LogFile /tmp/pvfs2-server.log
############################
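
The two server config files should differ only in HostID, and this can be confirmed mechanically. The following is a minimal sketch: parse_server_conf is a hypothetical helper that handles only the simple one-option-per-line layout shown above, and it is not the parser PVFS2 itself uses.

```python
def parse_server_conf(text):
    """Parse 'Key value' lines, skipping blank lines and '#' lines."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        options[key] = value.strip().strip('"')
    return options


# Contents copied from the two files quoted in this message.
MASTER = """
StorageSpace /pvfs2-storage-space
HostID "tcp://master-pvfs:3334"
LogFile /tmp/pvfs2-server.log
"""

NODE1 = """
StorageSpace /pvfs2-storage-space
HostID "tcp://node1-pvfs:3334"
LogFile /tmp/pvfs2-server.log
"""

master = parse_server_conf(MASTER)
node1 = parse_server_conf(NODE1)

# Keys whose values differ between the master and node1 configs.
differing = sorted(
    key for key in master.keys() | node1.keys()
    if master.get(key) != node1.get(key)
)
print(differing)  # prints ['HostID']
```

So both servers are told to use the same StorageSpace path, and the difference in behaviour has to come from what is actually on disk under that path.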

On the master, /pvfs2-storage-space is unfortunately a plain directory on
the root filesystem (/), so the mount-timing theory has to be discarded.

On the nodes, by contrast, /pvfs2-storage-space is on a mounted filesystem,
/dev/md1, but everything there apparently works, so it seems to me that the
problem really lies with the master node, i.e. the metadata server.
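
One way to double-check where each storage directory actually lives is to run a small script on both the master and a node. This is only a sketch: the path is the StorageSpace value from the config files above, and find_mount_point is a hypothetical helper name, not part of PVFS2.

```python
import os


def find_mount_point(path):
    """Walk upward from path until a mount point is reached."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


if __name__ == "__main__":
    # StorageSpace value from /etc/pvfs2-server.conf-*; adjust if yours differs.
    storage = "/pvfs2-storage-space"
    print(storage, "exists:", os.path.isdir(storage))
    print(storage, "is on the filesystem mounted at", find_mount_point(storage))
```

On the master this should report / if the directory really sits on the root filesystem. On a node it should presumably report /pvfs2-storage-space itself, assuming /dev/md1 is mounted directly at that path.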

The pvfs2-server log suggests re-running with the -f option, which looks
very dangerous to me. Or is it acceptable for the metadata server, in the
sense that it will reconstruct the data from the I/O nodes? I also cannot
understand why the different storage spaces all have the same directory,
"744468fe", in common, while the master has nothing besides this empty
directory.
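
On the common directory name: a quick calculation shows that 744468fe is the filesystem ID from /etc/pvfs2-fs.conf (1950640382) written in hexadecimal, which would explain why every server has a directory with that name. That it is the per-filesystem directory is an inference from this match, not something confirmed here by the PVFS2 documentation. The sketch below also reports which of the entries seen on node1 are missing from a given storage directory. The expected names are copied from the node1 listing quoted further down, and missing_entries is a hypothetical helper.

```python
import os

FS_ID = 1950640382  # "ID" from the <Filesystem> section of /etc/pvfs2-fs.conf


def collection_dir_name(fs_id):
    """Return the filesystem ID as lowercase hex, without a 0x prefix."""
    return format(fs_id, "x")


def missing_entries(storage_dir, fs_id=FS_ID):
    """List entries seen on node1 that are absent from storage_dir."""
    # Expected names are taken from the node1 listing, not from PVFS2 docs.
    expected = [
        collection_dir_name(fs_id),
        "collections.db",
        "storage_attributes.db",
    ]
    present = set(os.listdir(storage_dir)) if os.path.isdir(storage_dir) else set()
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    print(collection_dir_name(FS_ID))  # prints 744468fe
    print(missing_entries("/pvfs2-storage-space"))
```

Given the master listing in this thread, the second call would report collections.db and storage_attributes.db as missing, which is consistent with the "No such file or directory" error from trove_initialize.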

Even if the pvfs2-server process had been killed uncleanly on the master
(the metadata server), it should not, I assume, have been able to delete
data in the storage directory...

So this absence of data in /pvfs2-storage-space on the metadata server
is both disconcerting and confusing...

I hope this mail helps us proceed further.

Best Regards
Raimondo

Rob Ross wrote:
> Hi Raimondo,
>
> Two things. One, there is a second config file around that specifies 
> the storage directory etc. You should be able to find it in /etc/ 
> also. Please send that to us.
>
> An idea is that perhaps /pvfs2-storage-space is a mounted file system, 
> and that somehow it is getting mounted *after* the server is started? 
> Just a blind guess. If you try to start the service after the system 
> has finished booting, does it do the same thing?
>
> Thanks,
>
> Rob
>
> Raimondo Giammanco wrote:
>> Hello, there.
>>
>>  I am coming here seeking words of wisdom. I have searched the web and
>> this list but cannot seem to find useful information, so I am posting here.
>> I apologize if the answer to this question has already been provided and I
>> could not find it.
>>
>> I have a problem with a pvfs2 installation that was set up by a third
>> person. The cluster was shut down cleanly for scheduled maintenance
>> on the power lines, and I cannot bring pvfs2 up again.
>>
>> Here is the description.
>>
>> The cluster consists of a frontend and 9 nodes.
>>
>> As far as I understand, the frontend is the metadata server and the nodes
>> are I/O servers, as per the /etc/pvfs2-fs.conf file shown below:
>>
>> ####################
>> <Defaults>
>>         UnexpectedRequests 50
>>         EventLogging none
>>         LogStamp datetime
>>         BMIModules bmi_tcp
>>         FlowModules flowproto_multiqueue
>>         PerfUpdateInterval 1000
>>         ServerJobBMITimeoutSecs 30
>>         ServerJobFlowTimeoutSecs 30
>>         ClientJobBMITimeoutSecs 300
>>         ClientJobFlowTimeoutSecs 300
>>         ClientRetryLimit 5
>>         ClientRetryDelayMilliSecs 2000
>> </Defaults>
>>
>> <Aliases>
>>         Alias master-pvfs tcp://master-pvfs:3334
>>         Alias node1-pvfs tcp://node1-pvfs:3334
>>         Alias node2-pvfs tcp://node2-pvfs:3334
>>         Alias node3-pvfs tcp://node3-pvfs:3334
>>         Alias node4-pvfs tcp://node4-pvfs:3334
>>         Alias node5-pvfs tcp://node5-pvfs:3334
>>         Alias node6-pvfs tcp://node6-pvfs:3334
>>         Alias node7-pvfs tcp://node7-pvfs:3334
>>         Alias node8-pvfs tcp://node8-pvfs:3334
>>         Alias node9-pvfs tcp://node9-pvfs:3334
>> </Aliases>
>>
>> <Filesystem>
>>         Name pvfs2-fs
>>         ID 1950640382
>>         RootHandle 1048576
>>         <MetaHandleRanges>
>>                 Range master-pvfs 4-429496732
>>         </MetaHandleRanges>
>>         <DataHandleRanges>
>>                 Range node1-pvfs 429496733-858993461
>>                 Range node2-pvfs 858993462-1288490190
>>                 Range node3-pvfs 1288490191-1717986919
>>                 Range node4-pvfs 1717986920-2147483648
>>                 Range node5-pvfs 2147483649-2576980377
>>                 Range node6-pvfs 2576980378-3006477106
>>                 Range node7-pvfs 3006477107-3435973835
>>                 Range node8-pvfs 3435973836-3865470564
>>                 Range node9-pvfs 3865470565-4294967293
>>         </DataHandleRanges>
>>         <StorageHints>
>>                 TroveSyncMeta yes
>>                 TroveSyncData no
>>         </StorageHints>
>> </Filesystem>
>> ####################
>>
>> The nodes are apparently working correctly: at boot the /etc/init.d/pvfs2
>> script ran, and the log file (/tmp/pvfs2-server.log) on a node shows:
>> ####################
>> [D 10/08 14:39] PVFS2 Server version 2.6.2 starting.
>> ####################
>>
>> On the master, instead, it shows:
>> ####################
>> [D 10/09 11:09] PVFS2 Server version 2.6.2 starting.
>> [E 10/09 11:09] Error: trove_initialize: No such file or directory
>> [E 10/09 11:09]
>> ***********************************************
>> [E 10/09 11:09] Invalid Storage Space: /pvfs2-storage-space
>>
>> [E 10/09 11:09] Storage initialization failed.  The most common reason
>> for this is that the storage space has not yet been
>> created or is located on a partition that has not yet
>> been mounted.  If you'd like to create the storage space,
>> re-run this program with a -f option.
>> [E 10/09 11:09]
>> ***********************************************
>> [E 10/09 11:09] Error: Could not initialize server interfaces; aborting.
>> [E 10/09 11:09] Error: Could not initialize server; aborting.
>> ####################
>>
>> Now, the storage space on the nodes is populated:
>> ####################
>> [root@node1 ~]# ls /pvfs2-storage-space/
>> 744468fe  collections.db  lost+found  storage_attributes.db
>> ####################
>> but on the master (frontend) it is not:
>> ####################
>> [root@master ~]# ls /pvfs2-storage-space/
>> 744468fe
>> ####################
>>
>> Can anyone point me in the right direction?
>>
>> Thanks Again
>>
>> Raimondo
