[PVFS-users] mmargo
Ron W. Green
rwgree at sandia.gov
Fri Apr 2 22:18:54 EST 2004
We're using 1.6.2 with the January patch and are seeing these enqueue
messages.
If it helps, we have 236 client nodes talking to the one mgr node, and
have 6 iod nodes. Is it possible that we need a much deeper queue depth
to accommodate long latencies in talking to the mgr? Are the clients
spilling out of their local queues with requests waiting on mgr? I
suspect it may be a scaling issue.
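
To make the question concrete, here is a toy sketch of a fixed-depth queue that rejects requests once it is full. This is not PVFS code: `BoundedQueue`, `simulate`, and all the numbers are made up for illustration, and I have not checked how the real pvfsdev_enqueue behaves.

```python
from collections import deque


class BoundedQueue:
    """Toy fixed-depth request queue; hypothetical, not PVFS code."""

    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    def enqueue(self, request):
        # Refuse new work once the queue is full. This models the
        # "failed on enqueue" case being asked about.
        if len(self.items) >= self.depth:
            return False
        self.items.append(request)
        return True

    def dequeue(self):
        # Hand the oldest request to the consumer, or None if empty.
        return self.items.popleft() if self.items else None


def simulate(depth, arrivals_per_tick, served_per_tick, ticks):
    """Count requests rejected when arrivals outpace service."""
    queue = BoundedQueue(depth)
    failed = 0
    for tick in range(ticks):
        for n in range(arrivals_per_tick):
            if not queue.enqueue((tick, n)):
                failed += 1
        for _ in range(served_per_tick):
            queue.dequeue()
    return failed
```

In this toy model a deeper queue absorbs a short burst, but if requests keep arriving faster than they are serviced, any fixed depth eventually fills and enqueues start failing again.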
Thanks. I do appreciate the work being done on PVFS. It is improving.
ron
Nathan Poznick wrote:
>Date: Fri, 2 Apr 2004 10:27:54 -0700
>
>Thus spake Ron W. Green:
>
>
>>Martin,
>>
>>We seem to get those "failed on enqueue" messages quite often. Of course,
>>our cluster is much bigger too. I've scratched my head on this and looked
>>at the code. As best I can tell, it happens when the pvfs client attempts
>>a metadata operation on the mgr node. I suspect that the mgr is slow in
>>responding and/or has run out of queueing space to enqueue the metadata
>>operation request (create or stat).
>>
>>Anyone on the list know if mgr has a fixed queue size? Or can we jack
>>up the client timeouts? Multithread mgr?
>>
>>From our testing we're quite convinced the problem lies in mgr, that it
>>can't keep up with metadata requests from the clients.
>>
>>
>
>Actually those messages are not referring to any sort of queuing on the
>manager at all - they refer to the pvfsdev_enqueue/dequeue functions in
>the kernel module which add/remove messages from the /dev/pvfs-req
>device.
>
>
>
>
--
Ron W. Green
rwgree at sandia.gov
+1-505-284-1600
Sr. Engineer, ICC Applications Support