[Pvfs2-developers] Trove patches

Julian Martin Kunkel Julian.Kunkel at web.de
Tue Jul 18 17:00:40 EDT 2006


Hi,
enclosed you will find patches for the following issues:

Major changes:
* sync-coalesce:
If the last few operations that need to be synced finish with an error, the 
other operations can stall as well because of how the request scheduler 
handles them. (I found this bug a couple of weeks ago but did not know what 
caused the race condition.) It took me quite a while to track that bug 
down...
The policy now ensures that requests which finished with an error are never 
enqueued. 
(Note: should we flush the db in case of an error?)
* deleting in the background:
During deletion, files are renamed and then deleted in the background to 
shorten the response time of the server in case the file is very big...
Note: the storage dir has to reside on a single filesystem. 
* Trove multiqueue support:
Now there are separate threads for metadata ro, metadata rw, I/O and for 
deleting in the background.
This patch improves the throughput of read-only ops while write ops are in 
progress. Usually read ops are expected to be cached effectively, while all 
write ops force disk operations.
* Trove queues are changed to keep an internal count of queued elems.
Also I removed a few functions and added the _nolock suffix to more of them.
move_op_to_completion_queue is removed, and some other functions are 
replaced with dbpf_move_op_to_completion_queue and dbpf_op_pop_front_nolock.
* dbpf_dspace_cancel is modified to guarantee that the operation is not 
finished right now; it is also changed to use id_safe_gen instead of 
fast_gen. These changes make the interface more useful and remove some 
dependencies on how the upper layers use it. Also, test and testsome are 
changed so that they can be called at the same time without possible 
memleaks / segfaults.
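
The background deletion item above can be sketched roughly as follows. This 
is a minimal illustration under stated assumptions, not the code from the 
patch: the names deferred_remove, trash_drain and struct trash_queue are 
hypothetical, and the locking needed between the server thread and the 
background thread is left out. It also shows why the storage dir has to be on 
one filesystem: rename() fails with EXDEV when source and target are on 
different filesystems.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* All names here are hypothetical; none are taken from patchTrove.patch. */

struct trash_item {
    char path[4096];
    struct trash_item *next;
};

struct trash_queue {
    struct trash_item *head;  /* files renamed away but not yet unlinked */
    int count;                /* number of queued items */
};

/*
 * Fast path, run while the client waits: move the file into trash_dir and
 * remember it. Returns 0 on success or a negative errno value. rename()
 * fails with EXDEV if trash_dir is on a different filesystem than path.
 */
int deferred_remove(struct trash_queue *q, const char *path,
                    const char *trash_dir, unsigned long id)
{
    assert(q != NULL && path != NULL && trash_dir != NULL);

    struct trash_item *item = malloc(sizeof(*item));
    if (item == NULL)
        return -ENOMEM;

    int n = snprintf(item->path, sizeof(item->path), "%s/%lu.deleted",
                     trash_dir, id);
    if (n < 0 || (size_t)n >= sizeof(item->path)) {
        free(item);
        return -ENAMETOOLONG;
    }

    if (rename(path, item->path) != 0) {
        int err = errno;
        free(item);
        return -err;
    }

    item->next = q->head;
    q->head = item;
    q->count++;
    return 0;
}

/*
 * Slow path, meant to be called from a background thread: unlink everything
 * that was queued. Returns the number of files actually removed.
 */
int trash_drain(struct trash_queue *q)
{
    int removed = 0;

    while (q->head != NULL) {
        struct trash_item *item = q->head;
        q->head = item->next;
        q->count--;
        if (unlink(item->path) == 0)
            removed++;
        free(item);
    }
    return removed;
}
```

The request handler would only call deferred_remove, whose cost does not 
depend on the file size, while the expensive unlink of a very big file happens 
later in trash_drain.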

Minor changes:
* added define for mkdir syscall in dbpf.h
* added dbpf_op_get_status (and set status) to change the status of the 
return value atomically (this reduces the lines of code and makes the calls 
more consistent).
* Stripped out non-threaded Trove code.
* Enhanced request scheduler debugging
Added a new function, server_op_to_str, to pvfs2-internal.h, which allows 
pretty-printing the name of the op. The function is implemented in 
PINT-reqproto-encode.c; maybe this is not the right place for the function? 
It is used in the request scheduler, which now prints the whole queue for the 
particular handle on each enqueue op, with the current state of every op 
(e.g. serviced or queued). 
* performance debug:
I want to add a performance debugging option which prints a couple of 
interesting metrics on the server side that can be analysed post mortem 
(e.g. number of elements in the trove queues etc.).
For example, one might run a script to determine the time distribution of 
sync requests or I/O requests. I haven't added other logic yet but will soon.
* replaced fsync with fdatasync:
this might help to improve throughput on some journaling filesystems because 
metadata that is not needed to read the data back (e.g. timestamps) is not 
forced to be written.
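
As a rough illustration of the fsync / fdatasync item above, the following 
sketch writes a buffer and then flushes it with either call. The helper name 
write_file_synced and its data_only flag are hypothetical and not part of 
Trove. fdatasync() still flushes metadata that is required to retrieve the 
data (such as a changed file size) but may skip attributes like timestamps.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Trove: write len bytes from buf to path
 * and force them to stable storage. With data_only != 0, fdatasync() is used
 * instead of fsync(). Returns 0 on success or a negative errno value.
 */
int write_file_synced(const char *path, const void *buf, size_t len,
                      int data_only)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -errno;

    const char *p = buf;
    int ret = 0;

    /* write() may be partial or interrupted, so loop until all is written */
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            ret = -errno;
            break;
        }
        p += n;
        len -= (size_t)n;
    }

    if (ret == 0) {
        int rc = data_only ? fdatasync(fd) : fsync(fd);
        if (rc != 0)
            ret = -errno;
    }

    if (close(fd) != 0 && ret == 0)
        ret = -errno;
    return ret;
}
```

Whether the data_only variant is actually faster depends on the filesystem 
and workload, so it would have to be measured on the storage in question.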

enjoy,
Julian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patchTrove.patch
Type: text/x-diff
Size: 86118 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20060718/575ee14e/patchTrove-0001.bin

