[Pvfs2-developers] Trove patches

Sam Lang slang at mcs.anl.gov
Thu Jul 20 12:10:44 EDT 2006


I've made some comments on the minor changes.  See below.  I'm still
looking at the big changes.

-sam

On Jul 18, 2006, at 4:00 PM, Julian Martin Kunkel wrote:

> Hi,
> enclosed you will find patches for the following issues:
>
> Major changes:
> * sync-coalesce:
> If the last couple of operations which need to be synced finish with an
> error, then the other operations can be stalled too, due to the handling
> by the request scheduler. (I found this bug a couple of weeks ago but did
> not know what caused the race condition. It took me quite a while to
> track that bug down.)
> The policy right now ensures that error requests are never enqueued.
> (Note: should we flush the db in case of an error?)
> * deleting in the background:
> During deletion, files are renamed and then deleted in the background to
> shorten the response time of the server in case the file is very big.
> Note: people have to put the storage dir into one filesystem.
> * Trove multiqueue support:
> Now there is one thread each for metadata ro, metadata rw, I/O, and
> deleting in the background.
> This patch improves the throughput of read-only ops while write ops
> happen.
> Usually read ops are expected to be cached effectively, while all write
> ops force disk operations.
> * Trove queues are changed to keep an internal count of queued elems.
> I also removed a few functions and added _nolock to more function names.
> move_op_to_completion_queue is removed, and some other functions are
> replaced with dbpf_move_op_to_completion_queue and
> dbpf_op_pop_front_nolock.
> * dbpf_dspace_cancel is modified to guarantee that the operation is not
> finished right now, and changed to use id_safe_gen instead of fast_gen.
> These changes make the interface more useful and remove some dependencies
> on the usage from the upper layers. test and testsome are also changed so
> that they can be called at the same time without possible memleaks /
> segfaults.
>
> Minor changes:
> * added a define for the mkdir syscall in dbpf.h
> * added dbpf_op_get_status (and set status) to change the status of the
> return value atomically (this reduces the lines of code and makes the
> calls more consistent).
> * Stripped out non-threaded Trove code.
>
Cool.
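
For the status get/set item, here is a minimal sketch of what an atomic
accessor pair could look like, assuming each queued op carries its own
mutex.  The struct, the state names, and the example_ function names are
hypothetical stand-ins for illustration; they are not taken from the patch
or from the dbpf code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical op states; the real dbpf states are not reproduced here. */
enum example_op_state {
    EX_STATE_QUEUED,
    EX_STATE_IN_SERVICE,
    EX_STATE_COMPLETED
};

/* Hypothetical queued op: only the fields needed for the sketch. */
struct example_queued_op {
    pthread_mutex_t mutex;
    enum example_op_state state;
};

void example_op_init(struct example_queued_op *op)
{
    pthread_mutex_init(&op->mutex, NULL);
    op->state = EX_STATE_QUEUED;
}

/* Read the state while holding the op's mutex. */
enum example_op_state example_op_get_status(struct example_queued_op *op)
{
    enum example_op_state s;

    pthread_mutex_lock(&op->mutex);
    s = op->state;
    pthread_mutex_unlock(&op->mutex);
    return s;
}

/* Write the state while holding the op's mutex. */
void example_op_set_status(struct example_queued_op *op,
                           enum example_op_state s)
{
    pthread_mutex_lock(&op->mutex);
    op->state = s;
    pthread_mutex_unlock(&op->mutex);
}
```

Funnelling every read and write through one pair of functions keeps the
locking in a single place, which is presumably where the reduction in
lines of code comes from.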

> * Enhanced request scheduler debugging
> Added a new function, server_op_to_str, to pvfs2-internal.h, which
> pretty-prints the name of the op. The function is implemented in
> PINT-reqproto-encode.c; maybe this is not the right place for the
> function?

> This is used in the request scheduler, which now prints the whole queue
> on each enqueue op with current states of all ops for that particular
> handle (e.g. serviced or queued).

We've already got all the names of the operations in pvfs2-server.c  
(see the PINT_server_req_table).  I'd like to see that reused if  
possible.  Adding another array of all the operations means one more  
place that will have to be modified when new operations are added.
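
As a rough sketch of the single-table idea: the name lookup walks the one
table that already pairs each op with its name, so a new operation only has
to be added in one place.  Everything named example_ or EX_ below is a
hypothetical stand-in; the real PINT_server_req_table has its own layout,
which is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical op codes, standing in for the real server op enum. */
enum example_op {
    EX_OP_INVALID = 0,
    EX_OP_CREATE,
    EX_OP_REMOVE,
    EX_OP_IO
};

/* Hypothetical table entry: one row per op, holding its printable name. */
struct example_req_entry {
    enum example_op op;
    const char *name;
};

/* The single table; new ops get one new row here and nowhere else. */
static const struct example_req_entry example_req_table[] = {
    { EX_OP_INVALID, "invalid" },
    { EX_OP_CREATE,  "create"  },
    { EX_OP_REMOVE,  "remove"  },
    { EX_OP_IO,      "io"      },
};

/* Look the name up in the shared table instead of a second array. */
const char *example_op_to_str(int op)
{
    size_t n = sizeof(example_req_table) / sizeof(example_req_table[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if ((int)example_req_table[i].op == op)
            return example_req_table[i].name;
    }
    return "unknown";
}
```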

> * performance debug:
> I want to add a performance debugging option which prints a couple of
> interesting metrics on the server side, which might be analysed post
> mortem (e.g. the number of elements in the trove queues).
> For example, one might run a script to determine the time distribution
> of sync requests or I/O requests. I haven't added other logic yet, but
> will soon.
>

I'd prefer to see this as part of the event handling code, instead of  
as debug messages.  If the event handling code doesn't have the  
capabilities you need, we should fix it so it does.

> * replaced fsync with fdatasync;
> this might help on some journaling filesystems to improve throughput
> because metadata is not forced to be written.

Good idea.  We should probably have a check for
_POSIX_SYNCHRONIZED_IO to make sure it's there, though.
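
Something along these lines, as a sketch only.  The wrapper name
example_data_sync is made up for illustration and is not from the patch.
It uses fdatasync() when <unistd.h> reports the Synchronized I/O option
with a positive value, and falls back to fsync() otherwise.

```c
/* Ask for the POSIX.1-2008 interfaces so fdatasync() is declared. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <unistd.h>

/* Hypothetical wrapper: flush file data, preferring fdatasync(). */
int example_data_sync(int fd)
{
#if defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
    /* Synchronized I/O option is available at compile time. */
    return fdatasync(fd);
#else
    /* Option missing or unsupported: fall back to a full fsync(). */
    return fsync(fd);
#endif
}
```

The "> 0" test matters because a platform may define the macro as -1 to
say the option is not supported.  A value of 0 means support has to be
queried at run time (sysconf), which this sketch does not attempt; it just
falls back to fsync() in that case.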

-sam

>
> enjoy,
> Julian
> <patchTrove.patch>
> _______________________________________________
> Pvfs2-developers mailing list
> Pvfs2-developers at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers


