[Pvfs2-developers] msgpair error
Sam Lang
slang at mcs.anl.gov
Wed Jun 21 16:01:43 EDT 2006
On Jun 168, at 1:28PM, Sam Lang wrote:
>
> On Jun 16, 2006, at 6:44 PM, Pete Wyckoff wrote:
>
>> slang at mcs.anl.gov wrote on Fri, 16 Jun 2006 17:05 -0500:
>>> I haven't been able to reproduce this specific error or get dbench to
>>> fail for me, but Murali just committed a bunch of fixes to bugs in
>>> the keyval changes I made recently that I could imagine causing a lot
>>> of weird behaviors. If you find that these commits fix the dbench
>>> errors you were seeing, let me know. I'd prefer to wait to clean up
>>> the error handling on the server (db->err and the weird EIO case) as
>>> well as the msgpairarray problems until after the release.
>>
>> That does change things, but it's still a bit funky.
>>
>> My configure line looks like (straight tcp, no ib):
>>
>> CFLAGS=-g ../pvfs2/configure \
>> --prefix=/usr/local/pvfs2-test \
>> --enable-shared \
>> --enable-segv-backtrace \
>> --enable-epoll \
>> --with-kernel=/usr/src/kernel/linux-2.6.16
>>
>> Then I start 1 server with this config:
>>
>> <Defaults>
>> UnexpectedRequests 50
>> LogFile /tmp/pbstmp.1225/pvfs2.log
>> EventLogging none
>> LogStamp usec
>> BMIModules bmi_tcp
>> FlowModules flowproto_multiqueue
>> PerfUpdateInterval 1000
>> ServerJobBMITimeoutSecs 30
>> ServerJobFlowTimeoutSecs 30
>> ClientJobBMITimeoutSecs 300
>> ClientJobFlowTimeoutSecs 300
>> ClientRetryLimit 5
>> ClientRetryDelayMilliSecs 2000
>> </Defaults>
>>
>> <Aliases>
>> Alias ib14 tcp://ib14:3334
>> </Aliases>
>>
>> <Filesystem>
>> Name pvfs2-fs
>> ID 1990198648
>> RootHandle 1048576
>> <MetaHandleRanges>
>> Range ib14 4-4294967297
>> </MetaHandleRanges>
>> <DataHandleRanges>
>> Range ib14 4294967298-8589934591
>> </DataHandleRanges>
>> <StorageHints>
>> TroveSyncMeta no
>> TroveSyncData no
>> CoalescingHighWatermark infinity
>> CoalescingLowWatermark 1
>> </StorageHints>
>> </Filesystem>
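As a quick sanity check on the handle layout in the config quoted above, the sketch below verifies that the meta and data handle ranges do not overlap and that the root handle falls inside the meta range. The numbers are copied from the config; the helper names are hypothetical, and both the inclusive-bounds reading and the "root handle belongs in the meta range" rule are assumptions rather than anything stated in this thread.

```python
# Values copied from the <Filesystem> section quoted above.
META_RANGE = (4, 4294967297)
DATA_RANGE = (4294967298, 8589934591)
ROOT_HANDLE = 1048576


def in_range(handle, rng):
    """True if handle lies within rng, treating both bounds as inclusive (assumption)."""
    low, high = rng
    return low <= handle <= high


def ranges_overlap(a, b):
    """True if two inclusive ranges share at least one handle."""
    return a[0] <= b[1] and b[0] <= a[1]


def check_layout(meta, data, root):
    """Return a list of problems; an empty list means the layout looks consistent."""
    problems = []
    if ranges_overlap(meta, data):
        problems.append("meta and data handle ranges overlap")
    if not in_range(root, meta):
        # Assumption: the root handle is a metadata object.
        problems.append("root handle is outside the meta handle range")
    return problems


if __name__ == "__main__":
    print(check_layout(META_RANGE, DATA_RANGE, ROOT_HANDLE))
```

With the values from this config the check reports no problems, so the handle layout itself does not look like the source of the rm errors.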
>>
>> I'm using dbench-3.03 from the robl io tests distribution, with
>> command line, from a different machine, single client:
>>
>> ./dbench -D /pvfs -t 20 -c ./client.txt 10
>>
>> Everything finishes now, but all the threads complain, which they
>> had not done before:
>>
>> /bin/rm: cannot remove directory `/pvfs/clients/client0': No such
>> file or directory
>> /bin/rm: cannot remove directory `/pvfs/clients/client5': No such
>> file or directory
>> /bin/rm: cannot remove directory `/pvfs/clients/client8': No such
>> file or directory
>> /bin/rm: cannot remove directory `/pvfs/clients/client1': No such
>> file or directory
>> /bin/rm: cannot remove directory `/pvfs/clients/client3': No such
>> file or directory
>> /bin/rm: cannot remove directory `/pvfs/clients/client6': No such
>> file or directory
>> /bin/rm: cannot remove directory `/pvfs/clients/client7': No such
>> file or directory
>> /bin/rm: cannot remove directory `/pvfs/clients/client9': No such
>> file or directory
>>
>> Hope this helps; at least it does not lock up my machines anymore.
>> I'm only picking on you since I saw some dbpf checkins that seem to
>> have been related (the -EIO issue).
>>
>
> I'm able to get something similar with dbench now after that
> commit, so I'll look into that. I'm not sure about the EIO thing,
> it doesn't seem like we should return EIO if the get of a dspace
> object fails, but I think that may be code left over from Neill's
> first implementation, so he may have done it for some reason that
> I've missed.
>
Just committed a (painfully simple) fix for this. dbench seems to be
working again.
-sam
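
To illustrate the EIO question raised in the quoted text, here is a minimal sketch of the distinction being discussed: reporting a missing object as ENOENT and reserving EIO for genuine I/O failures. This is not the committed fix and not PVFS2 code; `ObjectNotFound`, `get_object`, and `lookup_status` are hypothetical names, and a plain dict stands in for the storage layer.

```python
import errno


class ObjectNotFound(Exception):
    """Hypothetical error: the requested handle does not exist in the store."""


def get_object(store, handle):
    """Fetch an object from a dict-like store (a stand-in for the real storage layer)."""
    try:
        return store[handle]
    except KeyError:
        raise ObjectNotFound(handle)


def lookup_status(store, handle):
    """Return 0 on success, or a negative errno value describing the failure."""
    try:
        get_object(store, handle)
    except ObjectNotFound:
        # A missing object is an expected condition, not an I/O failure.
        return -errno.ENOENT
    except Exception:
        # Anything else is treated as a genuine I/O-level error.
        return -errno.EIO
    return 0
```

The point of the split is that a caller removing a directory can tell "already gone" apart from "the storage layer failed", which a blanket EIO hides.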
> -sam
>
>
>> -- Pete
>>