[Pvfs2-developers] crdirent

Rob Ross rross at mcs.anl.gov
Mon Jun 12 23:07:31 EDT 2006


hey,

i know we're trying to keep the # of DBs down, but would it really hurt 
that much to just use a separate DB for this data rather than having to 
play funny games with the key strings?

also, it seems a little wacky that we have to pass a flag to tell trove 
when to count and when not to count. is there a clean way to avoid that?

how do you read the count?

otherwise i think it's great that we're moving the count 
increment/decrement into trove, that this will allow for concurrent 
modification, and that we can simplify the state machines.

thanks!

rob

Sam Lang wrote:
> 
> Hi all,
> 
> The new keyval code currently stores the size of a directory as a 
> separate common keyval.  The server state machines update this value 
> with get/set state actions as needed (in crdirent, rmdirent, etc.).  This 
> get and set actually prevents us from allowing the create and delete 
> operations of different files in the same directory to take place 
> concurrently, since the crdirent and rmdirent ops (on the parent dirdata 
> handle) get serialized.
> 
> I'd like to fix all this by providing a keyval per handle that contains 
> a null string as part of the key (I call it keyval-handle-info).  The 
> advantage of making it the null string is that it will appear first in 
> the lexical ordering of directory entries, so I can skip over it in 
> readdir easily.  This null keyval would only be created on handles as 
> necessary (right now only for counting dirents).  The 
> TROVE_KEYVAL_HANDLE_COUNT ds flag can be passed to trove operations.  For 
> example, in the case of crdirent, the TROVE_KEYVAL_HANDLE_COUNT and 
> TROVE_NOOVERWRITE flags would be passed to the trove_keyval_write call 
> to specify that the count should be incremented (or created and set to 
> 0 if it doesn't exist).  rmdirent would do something similar in 
> trove_keyval_remove.
> 
> Also, at present the crdirent and rmdirent state machines first do a 
> read of the keyval to check for existence.  This seems unnecessary.  
> Instead, the crdirent sm can just pass TROVE_NOOVERWRITE to the 
> keyval_write call, and fail if that call fails.  rmdirent already fails 
> if the keyval_remove fails so the extra keyval_read to check for 
> existence seems redundant.  Are there any good reasons for those extra 
> state actions that I'm missing?
> 
> I've attached a patch of the changes I've described.  I would like to 
> have this go into the trunk before the upcoming release, since it 
> requires (yet another) storage format change.  Let me know if there are 
> any questions or concerns.
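
For readers following the thread, here is a rough sketch of the idea discussed above: the store itself keeps a per-handle count under the empty-string key, bumps it when a write or remove carries a count flag, and rejects duplicate creates through a no-overwrite flag instead of a separate existence read. Everything here is hypothetical and is not the Trove API or the attached patch: the `kv_*` names, the `KV_NOOVERWRITE` and `KV_HANDLE_COUNT` flags, and the in-memory sorted array are illustration only. One assumption that may differ from the proposal: a newly created count starts at 1 so that it reflects the entry just added. The sketch is single-threaded and only shows where the count update lives.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flags; names and values are illustrative, not Trove's. */
enum { KV_NOOVERWRITE = 1, KV_HANDLE_COUNT = 2 };

#define KV_MAX 64
#define KV_KEYLEN 32

/* Toy per-handle keyval space, kept sorted by key so that the
 * empty-string info key, when present, is always entries[0]. */
struct kv_entry { char key[KV_KEYLEN]; long val; };
struct kv_space { struct kv_entry entries[KV_MAX]; int n; };

/* Index of key, or -1 if absent (a linear scan is enough for a sketch). */
static int kv_find(const struct kv_space *s, const char *key)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->entries[i].key, key) == 0)
            return i;
    return -1;
}

/* Insert at the sorted position; the caller guarantees there is room. */
static void kv_insert(struct kv_space *s, const char *key, long val)
{
    assert(s->n < KV_MAX && strlen(key) < KV_KEYLEN);
    int pos = s->n;
    while (pos > 0 && strcmp(s->entries[pos - 1].key, key) > 0) {
        s->entries[pos] = s->entries[pos - 1];
        pos--;
    }
    strcpy(s->entries[pos].key, key);
    s->entries[pos].val = val;
    s->n++;
}

/* Add delta to the count stored under "", creating it if needed. */
static void kv_adjust_count(struct kv_space *s, long delta)
{
    int i = kv_find(s, "");
    if (i < 0)
        kv_insert(s, "", delta);
    else
        s->entries[i].val += delta;
}

/* Write a non-empty key; returns 0 on success, -1 on failure. */
int kv_write(struct kv_space *s, const char *key, long val, int flags)
{
    assert(key[0] != '\0');          /* "" is reserved for the info key */
    int i = kv_find(s, key);
    if (i >= 0) {
        if (flags & KV_NOOVERWRITE)
            return -1;               /* duplicate create fails here */
        s->entries[i].val = val;     /* overwrite: count unchanged */
        return 0;
    }
    if (s->n + 2 > KV_MAX)           /* room for entry plus info key */
        return -1;
    kv_insert(s, key, val);
    if (flags & KV_HANDLE_COUNT)
        kv_adjust_count(s, 1);
    return 0;
}

/* Remove a key; returns -1 if it does not exist. */
int kv_remove(struct kv_space *s, const char *key, int flags)
{
    int i = kv_find(s, key);
    if (i < 0)
        return -1;
    for (; i < s->n - 1; i++)
        s->entries[i] = s->entries[i + 1];
    s->n--;
    if (flags & KV_HANDLE_COUNT)
        kv_adjust_count(s, -1);
    return 0;
}

/* Read the count: the value under "", or 0 if it was never created. */
long kv_count(const struct kv_space *s)
{
    return (s->n > 0 && s->entries[0].key[0] == '\0') ? s->entries[0].val : 0;
}

/* Index where a readdir-style scan should start, skipping the info key. */
int kv_first_dirent(const struct kv_space *s)
{
    return (s->n > 0 && s->entries[0].key[0] == '\0') ? 1 : 0;
}
```

In this toy version, reading the count is just a lookup of the empty key, and a readdir-style iteration starts at `kv_first_dirent()` so the info key never shows up as a directory entry. Whether the real keyval layer would expose the count that way is left open here.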

