[Pvfs2-users] Error running pvfs2-server: Error: handle 0 is invalid (out of bounds)
Sam Lang
slang at mcs.anl.gov
Tue Jul 1 11:35:00 EDT 2008
Rongrong saw the same problem with DB 4.7. IIRC we just switched to
using 4.6 instead of debugging the problem. It looks like Berkeley DB
has changed the semantics of its interfaces (or at least added some
strictness checking). We probably just need to set the DB_DBT_USERMEM
flag on our key DBTs everywhere.
-sam
On Jul 1, 2008, at 10:23 AM, Phil Carns wrote:
> Ok, let us know what you find out. I'll try 4.7 on my end too, but
> I have a feeling you'll get to try 4.6 before I get around to trying
> 4.7 :)
>
> -Phil
>
> Mark J. Hoy wrote:
>> Thanks Phil -
>> I'll try installing db 4.6 (or earlier) this afternoon and see if
>> that makes any difference (we don't need 4.7 for anything on our
>> system at the moment, so switching back won't be any issue)...
>> As for ldd - on all nodes, I'm seeing the same versions:
>> libdb-4.7.so => /usr0/BerkeleyDB.4.7/lib/libdb-4.7.so (0x00002b3225245000)
>> libpthread.so.0 => /lib64/libpthread.so.0 (0x00000030e3300000)
>> librt.so.1 => /lib64/librt.so.1 (0x00000030e9000000)
>> libc.so.6 => /lib64/libc.so.6 (0x00000030e2c00000)
>> /lib64/ld-linux-x86-64.so.2 (0x00000030e2000000)
>> Thanks again!
>> -Mark
>> Phil Carns wrote:
>>> Hi Mark,
>>>
>>> Ok, maybe this is some Berkeley DB compatibility issue; it looks
>>> like 4.7 came hot off the presses a couple of weeks ago. Maybe
>>> there is something new we need to figure out.
>>>
>>> It is strange that you would only have trouble with one particular
>>> server, though. Could you double check with "ldd ./pvfs2-server"
>>> that they are all linked to the same db library?
>>>
>>> -Phil
>>>
>>> Mark J. Hoy wrote:
>>>> Thanks Phil -
>>>>
>>>> Retrying the re-initialization of the storage space (-f) reports
>>>> success, but also prints these additional messages:
>>>> [S 07/01 10:54] PVFS2 Server on node boston19 version 2.7.1 starting...
>>>> [E 07/01 10:54] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
>>>> [E 07/01 10:54] error in dspace create (db_p->get failed).
>>>> [E 07/01 10:54] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
>>>> [E 07/01 10:54] error in dspace create (db_p->get failed).
>>>> [E 07/01 10:54] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
>>>> [E 07/01 10:54] error in dspace create (db_p->get failed).
>>>> [E 07/01 10:54] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
>>>> [E 07/01 10:54] error in dspace create (db_p->get failed).
>>>> [D 07/01 10:54] PVFS2 Server: storage space created. Exiting.
>>>>
>>>> This only happens on the very first server (server1 in my
>>>> configuration below) - also, I'm using a standard make/make
>>>> install of BerkeleyDB.4.7 (no odd configuration options other
>>>> than the --prefix setting to install to a different path)...
>>>>
>>>> Running showcoll on the first server yields:
>>>> [E 10:58:15.614847] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
>>>> [E 10:58:15.614940] TROVE:DBPF:Berkeley DB: DB->get: Invalid argument
>>>> 0x00100000 (dspace_getattr output: type = unknown, b_size = 5421632)
>>>> [E 10:58:15.614990] TROVE:DBPF:Berkeley DB: DB_THREAD mandates memory allocation flag on key DBT
>>>> [E 10:58:15.615003] TROVE:DBPF:Berkeley DB: DB->get: Invalid argument
>>>> 0x00000000 (dspace_getattr output: type = unknown, b_size = 5421632)
>>>>
>>>> Running showcoll (with the same parameters) on any of the other 5
>>>> nodes yields:
>>>> [E 11:01:34.794473] src/io/trove/trove-dbpf/dbpf-mgmt.c line 515: dbpf_collection_geteattr: DB_NOTFOUND: No matching key/data pair found
>>>> [E 11:01:34.794792] [bt] bin/pvfs2-showcoll(dbpf_collection_geteattr+0x103) [0x414063]
>>>> [E 11:01:34.794811] [bt] bin/pvfs2-showcoll(main+0x417) [0x4070b7]
>>>> [E 11:01:34.794822] [bt] /lib64/libc.so.6(__libc_start_main+0xf4) [0x30e2c1d084]
>>>> [E 11:01:34.794834] [bt] bin/pvfs2-showcoll(aio_fsync64+0x39) [0x406669]
>>>> Storage space /usr2/pvfs-storage, collection pvfs2-fs (coll_id = 1375400306, *** no root_handle found ***):
>>>>
>>>> ... not sure what I'm missing here... Thanks!
>>>>
>>>> -Mark
>>>>
>>>>
>>>> Phil Carns wrote:
>>>>> Hello,
>>>>>
>>>>> I haven't seen that error message in that particular context
>>>>> before. In general, though, it happens on startup when the
>>>>> server finds that it has a handle (storage object) in its
>>>>> directory that doesn't match the ranges specified in the
>>>>> configuration file.
>>>>>
>>>>> In this specific case it thinks there is a handle with value 0
>>>>> in your storage space, which shouldn't happen.
>>>>>
>>>>> Has the server ever started successfully, or is this the first
>>>>> attempt to get it running?
>>>>>
>>>>> You may want to try just deleting the storage space (the
>>>>> /usr2/pvfs-storage directory) and redoing the "-f" step, if you
>>>>> haven't already.
>>>>>
>>>>> You could also try running this command:
>>>>>
>>>>> pvfs2-showcoll -s /usr2/pvfs-storage -c pvfs2-fs
>>>>>
>>>>> That should list all of the handles in the storage space so that
>>>>> we can confirm if there is really bad data in there or if there
>>>>> is something wrong with the server's startup.
>>>>>
>>>>> -Phil
>>>>>
>>>>> Mark J. Hoy wrote:
>>>>>> Hi -
>>>>>>
>>>>>> I'm trying to get PVFS2 version 2.7.1 (latest stable) up and
>>>>>> running. It compiles without issue, and I can initialize my
>>>>>> storage (via "pvfs2-server -f /path/to/config/file"), but I'm
>>>>>> having a problem getting the server to start...
>>>>>>
>>>>>> Every time I try running
>>>>>> "sbin/pvfs2-server /path/to/config/file" (where
>>>>>> /path/to/config/file is my configuration file generated via
>>>>>> pvfs2-genconfig), I get an error: Error: handle 0 is invalid
>>>>>> (out of bounds)
>>>>>>
>>>>>> The relevant pieces of the log are shown below:
>>>>>> [D 06/27 13:32] PVFS2 Server version 2.7.1 starting.
>>>>>> [E 06/27 13:32] Error: handle 0 is invalid (out of bounds)
>>>>>> [E 06/27 13:32] Error adding handle range 3-1317624576693539402,2635249153387078803-3952873730080618202 to filesystem pvfs2-fs
>>>>>> [E 06/27 13:32] Error: Could not initialize server interfaces; aborting.
>>>>>> [E 06/27 13:32] Error: Could not initialize server; aborting.
>>>>>>
>>>>>> This seems to happen both with a single-machine configuration
>>>>>> and with a cluster configuration (6 machines), _but_ in the
>>>>>> multiple-machine configuration it only happens when I try to
>>>>>> start the first I/O node; the other 5 machines start up
>>>>>> without issue.
>>>>>>
>>>>>> Has anyone else experienced this sort of problem? I've attached
>>>>>> a copy of my configuration file below (but changed the machine
>>>>>> names to protect the innocent). Also, I'm running on a
>>>>>> homogeneous configuration where all six of my machines are
>>>>>> running Fedora Core 5, kernel: Linux version 2.6.19.1-001-K8,
>>>>>> Dual-Core AMD Opteron(tm) Processor (model 1218, 2.6 GHz), 4GB
>>>>>> RAM, and 400 GB of storage on the volume for pvfs2
>>>>>>
>>>>>> <Defaults>
>>>>>> UnexpectedRequests 50
>>>>>> EventLogging none
>>>>>> LogStamp datetime
>>>>>> BMIModules bmi_tcp
>>>>>> FlowModules flowproto_multiqueue
>>>>>> PerfUpdateInterval 1000
>>>>>> ServerJobBMITimeoutSecs 30
>>>>>> ServerJobFlowTimeoutSecs 30
>>>>>> ClientJobBMITimeoutSecs 300
>>>>>> ClientJobFlowTimeoutSecs 300
>>>>>> ClientRetryLimit 5
>>>>>> ClientRetryDelayMilliSecs 2000
>>>>>>
>>>>>> StorageSpace /usr2/pvfs-storage
>>>>>> LogFile /tmp/pvfs2-server.log
>>>>>> </Defaults>
>>>>>> <Aliases>
>>>>>> Alias server1 tcp://server1:3334
>>>>>> Alias server2 tcp://server2:3334
>>>>>> Alias server3 tcp://server3:3334
>>>>>> Alias server4 tcp://server4:3334
>>>>>> Alias server5 tcp://server5:3334
>>>>>> Alias server6 tcp://server6:3334
>>>>>> </Aliases>
>>>>>> <Filesystem>
>>>>>> Name pvfs2-fs
>>>>>> ID 1375400306
>>>>>> RootHandle 1048576
>>>>>> <MetaHandleRanges>
>>>>>> Range server1 3-1152921504606846977
>>>>>> Range server6 1152921504606846978-2305843009213693952
>>>>>> </MetaHandleRanges>
>>>>>> <DataHandleRanges>
>>>>>> Range server1 2305843009213693953-3458764513820540927
>>>>>> Range server2 3458764513820540928-4611686018427387902
>>>>>> Range server3 4611686018427387903-5764607523034234877
>>>>>> Range server4 5764607523034234878-6917529027641081852
>>>>>> Range server5 6917529027641081853-8070450532247928827
>>>>>> Range server6 8070450532247928828-9223372036854775802
>>>>>> </DataHandleRanges>
>>>>>> <StorageHints>
>>>>>> TroveSyncMeta yes
>>>>>> TroveSyncData no
>>>>>> </StorageHints>
>>>>>> </Filesystem>
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pvfs2-users mailing list
>>>>>> Pvfs2-users at beowulf-underground.org
>>>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users