[Pvfs2-developers] protocol encoding

Sam Lang slang at mcs.anl.gov
Fri Feb 1 11:34:13 EST 2008


On Feb 1, 2008, at 10:02 AM, Walter B. Ligon III wrote:

> Murali, Sam, thanks, that explains the macro naming!
>
> I figured that the skip was for aligning on 64 bits, but I'm still not
> exactly sure where it is used.  Do we actually encode a pointer when
> we have an array?

Each element of the array is encoded into the buffer separately.  Not  
sure that answers your question though.  We don't ever encode pointers  
of course, but we do sometimes use pointers to determine offsets into  
structures.  encode_PINT_Request is a good example of that.  We  
linearize the request by taking the pointers into the request and  
replacing them with offsets, and then encode those offsets.  BTW,  
didn't you write most of that code? ;-)
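
To make the pointer-to-offset idea concrete, here is a minimal sketch.
It is not the actual encode_PINT_Request code: struct node, struct
wire_node, linearize() and relink() are hypothetical names, and it
assumes all nodes live in one contiguous array so that a pointer can be
replaced by an array index (a simple form of offset).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form: nodes refer to each other by pointer. */
struct node {
    int32_t value;
    struct node *next;      /* NULL means "no link" */
};

/* Hypothetical wire form: the pointer becomes an index, -1 means "no link". */
struct wire_node {
    int32_t value;
    int32_t next_index;
};

/* Replace each pointer with its offset (index) from the start of the array. */
static void linearize(const struct node *base, size_t n, struct wire_node *out)
{
    for (size_t i = 0; i < n; i++) {
        const struct node *next = base[i].next;
        /* Every pointer must land inside the array being linearized. */
        assert(next == NULL || (next >= base && next < base + n));
        out[i].value = base[i].value;
        out[i].next_index = next ? (int32_t)(next - base) : -1;
    }
}

/* Receiver side: turn the offsets back into pointers into its own array. */
static void relink(const struct wire_node *in, size_t n, struct node *base)
{
    for (size_t i = 0; i < n; i++) {
        assert(in[i].next_index >= -1 && in[i].next_index < (int32_t)n);
        base[i].value = in[i].value;
        base[i].next = in[i].next_index < 0 ? NULL : &base[in[i].next_index];
    }
}
```

Only the value and next_index fields would ever be encoded; the pointer
values themselves never go on the wire.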

But I think you're right, the skip4 is used for 64-bit alignment.  We
decode our basic types by just casting the buffer location to the type,
so all of our decoded fields point directly into the buffer.  An
unaligned field would cost us extra memory accesses, TLB misses, and
possibly non-atomic loads and stores, which makes the skip4 worth the
trouble.
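
As a rough illustration of where the padding lands, the sketch below
tracks the buffer offset while encoding three layouts.  The helpers
(struct cursor, put_u32, put_u64, put_skip4) are hypothetical and are
not the PVFS2 macros; byte order is ignored, and the buffer start is
assumed to be 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encode cursor: a buffer plus the current write offset. */
struct cursor {
    unsigned char *base;
    size_t cap;
    size_t off;
};

static void put_u32(struct cursor *c, uint32_t v)
{
    assert(c->off + sizeof v <= c->cap);
    memcpy(c->base + c->off, &v, sizeof v);
    c->off += sizeof v;
}

static void put_u64(struct cursor *c, uint64_t v)
{
    assert(c->off + sizeof v <= c->cap);
    memcpy(c->base + c->off, &v, sizeof v);
    c->off += sizeof v;
}

/* skip4: four bytes of padding that carry no data. */
static void put_skip4(struct cursor *c)
{
    assert(c->off + 4 <= c->cap);
    memset(c->base + c->off, 0, 4);
    c->off += 4;
}

/* u64, u64, skip4, u32 count: returns the offset where the array starts. */
static size_t array_offset_with_skip(unsigned char *buf, size_t cap)
{
    struct cursor c = { buf, cap, 0 };
    put_u64(&c, 1);
    put_u64(&c, 2);
    put_skip4(&c);
    put_u32(&c, 3);     /* array count */
    return c.off;       /* 8 + 8 + 4 + 4 = 24 */
}

/* Same fields without the skip4: the array would start misaligned. */
static size_t array_offset_missing_skip(unsigned char *buf, size_t cap)
{
    struct cursor c = { buf, cap, 0 };
    put_u64(&c, 1);
    put_u64(&c, 2);
    put_u32(&c, 3);     /* array count */
    return c.off;       /* 8 + 8 + 4 = 20 */
}

/* u64, u32, u32 count: the two 32-bit fields pair up, so no skip is needed. */
static size_t array_offset_no_skip_needed(unsigned char *buf, size_t cap)
{
    struct cursor c = { buf, cap, 0 };
    put_u64(&c, 1);
    put_u32(&c, 2);
    put_u32(&c, 3);     /* array count */
    return c.off;       /* 8 + 4 + 4 = 16 */
}
```

With the skip4 the array elements start at offset 24, a multiple of 8,
so casting that buffer location to a 64-bit type is safe.  Without it
they would start at offset 20.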

-sam


>
>
> Walt
>
> Sam Lang wrote:
>> On Jan 30, 2008, at 11:39 AM, Murali Vilayannur wrote:
>>> Hi Walt,
>>>> In src/proto/pvfs2-req-proto.h when you define a new request you create
>>>> a struct and then use a macro to create an encoding function for the
>>>> struct (endecode_fields_X_struct).  Sometimes, in the args to those
>>>> macros you insert a skip4,, which I gather is used to align something.
>>>> Can someone explain the rules for when and where you place this?
>>>
>>> ia64 and/or x86_64 like pointers aligned on 8-byte boundaries.
>>> Unaligned access to memory via a pointer can cause one or more of the
>>> following, I think:
>>> - slower performance
>>> - segmentation faults
>>> skip4 is needed only when pointers are involved, as far as I
>>> understand. You don't need it when accessing scalar types.
>>>
>>>>
>>>> There is also some confusion as to the naming of those macros, in that
>>>> some of them seem to count the skip4,, and some don't.  In particular,
>>>> if there are 3 scalar arguments, but we need one skip, we use the
>>>> endecode_fields_4_struct macro - so we DO count the skip (3 args + skip
>>>> = 4) but if there is an array, say 3 scalars plus an array, we use the
>>>> endecode_fields_3a_struct macro - so we DO NOT count the skip.  Some
>>>> array macros have one, some have two skips.  Any words of wisdom, or do
>>>> we just have to look it up in the code?
>>>
>>> yeah.. I think that needs to be fixed. If I am not mistaken, the
>>> number in the naming convention must include the skip4 also.
>>> As for why some array macros have one skip and some have two: the
>>> former embed only 1 pointer while the latter embed two sets of
>>> arrays along with their counts.
>>> Since the count is typically 4 bytes, we insert a skip4 and then drop
>>> the array after that.
>> Murali, I agree with your explanation, but it's not clear what you
>> think needs to be fixed.  The macros match the naming convention
>> you described.
>> Maybe part of the confusion that Walt & Co. have is that an array
>> macro (3a_struct) always includes an extra uint32_t for the length
>> of the array, but that field is not counted in the number used in
>> the name of the macro.  For example:
>> endecode_fields_3a_struct(
>>    PVFS_servresp_readdir,
>>    PVFS_ds_keyval, token,
>>    uint64_t, directory_version,
>>    skip4,,
>>    uint32_t, dirent_count,
>>    PVFS_dirent, dirent_array)
>> There are 3 fields and an array here.  The 3 fields are token,
>> directory_version, and the skip4.  The array is the count and the
>> dirent_array.  The skip is needed because of the 32-bit count
>> field, but in some cases it wouldn't be:
>> endecode_fields_2a_struct(
>>    my_op,
>>    uint64_t, var1,
>>    uint32_t, var2,
>>    uint32_t, array_count,
>>    uint32_t, array)
>> You might argue that we should always just pad the array count, or
>> use a 64-bit value for it, but I don't think Pete wanted to waste
>> bytes in the request unless necessary.  It's hard to quibble about 4
>> bytes, but that design focus does help keep request messages under
>> the eager message sizes of our protocols.
>> -sam
>>>
>>>
>>>> For those who are interested, the first thing we are working on is
>>>> Phil's server-to-server enabled file create.  In the first step we are
>>>> migrating the client create syscall functionality to the server, then we
>>>> will work on implementing collective communication.  Right now we are
>>>> trying to figure out to what extent we can use the new state machine
>>>> features to simplify that by essentially starting a client state machine
>>>> on the server.  Any input on that activity is encouraged.
>>> Awesome!
>>> Thanks,
>>> Murali
>>>>
>>>> Walt
>>>> -- 
>>>> Dr. Walter B. Ligon III
>>>> Associate Professor
>>>> ECE Department
>>>> Clemson University
>>>> _______________________________________________
>>>> Pvfs2-developers mailing list
>>>> Pvfs2-developers at beowulf-underground.org
>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>>>


