[Pvfs2-developers] PVFS-hint

Sam Lang slang at mcs.anl.gov
Fri Oct 6 14:41:06 EDT 2006


On Oct 6, 2006, at 12:19 PM, Sam Lang wrote:

>
> On Oct 6, 2006, at 2:01 AM, Julian Martin Kunkel wrote:
>
>> Hi,
>>> This way callers wouldn't be able to muck with the internals of the
>>> hint struct.
>> Ok, I will definitely do this.
>>> As I said, I prefer letting the hint struct be defined
>>> externally and requiring an array of them to the system interfaces.
>>> It seems to match what we have throughout the rest of PVFS.
>> If the pvfs2-team thinks that is the way to go I will do so, even if
>> I prefer the list for the system interface. So I think the array wins?
>>
>>> Setting specific IO servers for a file is interesting for research
>>> purposes, but kind of breaks the encapsulation and abstraction that
>>> pvfs tries to provide.  We've already got a distribution parameter
>>> that we pass in to PVFS_sys_create when appropriate.
>>> I don't know that we want to go around telling users to set an
>>> environment variable (a separate interface, remember) for specific
>>> behavior that they've requested.  I'd rather figure out why the dist
>>> parameter isn't providing the bits of functionality that they need,
>>> and solve their problem from that angle.
>>> I vaguely remember having
>>> this discussion about specifying datafile handles through the dist
>>> parameter, and there being some concerns, but I can't remember the
>>> details.  Can you remind me why that doesn't work?
>> The dist parameter is opaque to the client interface and should only
>> be interpreted by the distribution. However, at the distribution
>> layer the hostnames are not available, and they would be usable only
>> by distributions that interpret the given hostnames.
>
> The specific dist implementation is certainly opaque from the  
> client interface, but we do provide functions to set parameters on  
> a particular distribution.  What I would envision is something  
> along these lines:
>
> struct PINT_dist_server_settable_indices {
>
> 	int count;
> 	int * servers;
> };
>
> ...
>
> PVFS_sys_dist mydist;
> int server_list[] = {2, 4, 6, 8};	/* backing storage for the pointer member */
> struct PINT_dist_server_settable_indices indices =
> {
> 	4,
> 	server_list
> };
>
> ...
>
> mydist.name = "server-settable-dist";
>
> PVFS_sys_dist_setparam(&mydist, "servers", &indices);
>
> ...
>
> PVFS_sys_create(..., &mydist, ...);
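
A minimal sketch of how a setparam call like the one above might store the
indices in the distribution's params. Everything here (mock_dist,
settable_indices, mock_dist_setparam) is a hypothetical stand-in, not the
real PVFS_sys_dist or PVFS_sys_dist_setparam; it only illustrates copying
the caller's list so the distribution owns its own params.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real PVFS types and calls may differ. */
struct settable_indices {
    int count;
    int *servers;
};

struct mock_dist {
    const char *name;
    struct settable_indices params; /* copy owned by the distribution */
};

/* Copies the caller's index list into the distribution's params.
 * Returns 0 on success, -1 on an unknown key, an empty list, or
 * allocation failure. */
static int mock_dist_setparam(struct mock_dist *dist, const char *key,
                              const struct settable_indices *value)
{
    int *copy;
    size_t bytes;

    assert(dist != NULL && key != NULL && value != NULL);

    if (strcmp(key, "servers") != 0 || value->count <= 0)
        return -1;

    bytes = (size_t)value->count * sizeof(*copy);
    copy = malloc(bytes);
    if (copy == NULL)
        return -1;
    memcpy(copy, value->servers, bytes);

    free(dist->params.servers); /* drop any previously stored list */
    dist->params.servers = copy;
    dist->params.count = value->count;
    return 0;
}
```

Copying rather than keeping the caller's pointer means the list can live on
the caller's stack, as it does in the example above.
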
>
> The server-settable-dist would be implemented to store the indices  
> for the IO servers in the params field of the PVFS_sys_dist  
> structure.  I used server indices, because the PINT_dist_*  
> interfaces allow for that (through the PINT_request_file_data  
> struct), but it's a bit ugly and probably confusing to the user.  We  
> could change that though, and use something like server aliases or  
> hostnames, passing them through to the distribution parameters  
> instead of indices.  This would require a change to the  
> distribution method calls, and the
>
>> Also the create sm has to
>> be modified a lot to use the new distribution facilities.
>
> It certainly would.  The client create state machine is a little  
> broken in this respect anyway though.  It normally just gets the  
> list of IO servers from the cached server config file and does a  
> create request to all of them.  There's special case handling right  
> now for the directory hints on the parent to see if the number of  
> datafile handles should be fewer than available.
> It seems like that could be generalized to get a list of server  
> indices from the distribution with a distribution method that  
> returns that info.  We already store a distribution in the  
> directory hints anyway, so we could probably just throw away that  
> dfile_count parameter.  In any case, I would be in favor of a  
> change to the create sm that allows the distribution to optionally  
> specify the server indices, and fall back to the cached config  
> lookup otherwise.
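
A rough sketch of that selection step, assuming the distribution can
optionally hand back a list of server indices. The struct and function
names (dist_server_choice, select_create_servers) are made up for
illustration and are not part of the create state machine or the
PINT_dist_* interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of what a distribution might expose; count == 0
 * means the distribution states no preference. */
struct dist_server_choice {
    int count;
    const int *servers;
};

/* Fills out[] with the server indices to send create requests to and
 * returns how many were written. Returns -1 if the distribution names
 * an index outside [0, config_count) or out[] is too small. */
static int select_create_servers(const struct dist_server_choice *choice,
                                 int config_count, int *out, int out_cap)
{
    int i;

    assert(out != NULL && config_count >= 0);

    if (choice != NULL && choice->count > 0) {
        /* The distribution specified servers: validate and use them. */
        if (choice->count > out_cap)
            return -1;
        for (i = 0; i < choice->count; i++) {
            if (choice->servers[i] < 0 || choice->servers[i] >= config_count)
                return -1;
            out[i] = choice->servers[i];
        }
        return choice->count;
    }

    /* Fallback: every IO server known from the cached config. */
    if (config_count > out_cap)
        return -1;
    for (i = 0; i < config_count; i++)
        out[i] = i;
    return config_count;
}
```
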
>
>
>> Of course, adding another array to the distribution struct, common
>> to all distributions, would solve the issue, but I would not prefer
>> this solution.
>>
> No that's gross.
>
>> With the hint it is a common infrastructure given for all available
>> distributions, and it is also an uncommon case to set the distribution.
>
> Right, I don't like the idea of 'overriding' a distribution's  
> behavior with a hint.  Conceptually it seems like the distribution  
> should take care of this, and if it doesn't, we should fix it so it  
> does.
> Also, the server list hint gets a little ugly across sysint calls,  
> since the list doesn't actually get stored anywhere for that file  
> (does it?).  With the distribution, we only have to specify it once  
> at creation time, instead of passing it along to all future IO  
> calls on that file.

Actually not sure what I was thinking there.  The datafile handles of  
course get stored in the dh keyval.

-sam

>
> -sam
>
>> Wasn't the hint intended to be useful for research purposes?
>>
>> Thanks for the discussion and the time you spent,
>> Julian
>>
>> _______________________________________________
>> Pvfs2-developers mailing list
>> Pvfs2-developers at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>
>


