[Pvfs2-developers] PVFS-hint
Sam Lang
slang at mcs.anl.gov
Fri Oct 6 14:41:06 EDT 2006
On Oct 6, 2006, at 12:19 PM, Sam Lang wrote:
>
> On Oct 6, 2006, at 2:01 AM, Julian Martin Kunkel wrote:
>
>> Hi,
>>> This way callers wouldn't be able to muck with the internals of the
>>> hint struct.
>> Ok, I will definitely do this.
>>> As I said, I prefer letting the hint struct be defined
>>> externally and requiring an array of them to the system interfaces.
>>> It seems to match what we have throughout the rest of PVFS.
>> If the pvfs2-team thinks that is the way to go I will do so, even
>> if I prefer
>> the list for the system interface. So I think the array wins?
>>
>>> Setting specific IO servers for a file is interesting for research
>>> purposes, but kind of breaks the encapsulation and abstraction that
>>> pvfs tries to provide. We've already got a distribution parameter
>>> that we pass in to PVFS_sys_create when appropriate.
>>> I don't know that we want to go around telling users to set an
>>> environment variable (a separate interface, remember) for specific
>>> behavior that they've requested. I'd rather figure out why the dist
>>> parameter isn't providing the bits of functionality that they need,
>>> and solve their problem from that angle.
>>> I vaguely remember having
>>> this discussion about specifying datafile handles through the dist
>>> parameter, and there being some concerns, but I can't remember the
>>> details. Can you remind me why that doesn't work?
>> The dist parameter is opaque for the client interface and should
>> only be
>> interpreted by the distribution. However, at the layer of the
>> distribution
>> the hostnames are not available, and it would be available only to
>> distributions which interpret the given hostnames.
>
> The specific dist implementation is certainly opaque from the
> client interface, but we do provide functions to set parameters on
> a particular distribution. What I would envision is something
> along these lines:
>
> struct PINT_dist_server_settable_indices {
>
> int count;
> int * servers;
> };
>
> ...
>
> PVFS_sys_dist mydist;
> int server_list[] = {2, 4, 6, 8};
> struct PINT_dist_server_settable_indices indices =
> {
> 4,
> server_list
> };
>
> ...
>
> mydist.name = "server-settable-dist";
>
> PVFS_sys_dist_setparam(&mydist, "servers", &indices);
>
> ....
>
> PVFS_sys_create(..., &mydist, ...);
>
> The server-settable-dist would be implemented to store the indices
> for the IO servers in the params field of the PVFS_sys_dist
> structure. I used server indices, because the PINT_dist_*
> interfaces allow for that (through the PINT_request_file_data
> struct), but it's a bit ugly and probably confusing to the user. We
> could change that though, and use something like server aliases or
> hostnames, passing them through to the distribution parameters
> instead of indices. This would require a change to the
> distribution method calls, and the
>
>> Also the create sm has to
>> be modified a lot to use the new distribution facilities.
>
> It certainly would. The client create state machine is a little
> broken in this respect anyway though. It normally just gets the
> list of IO servers from the cached server config file and does a
> create request to all of them. There's special case handling right
> now for the directory hints on the parent to see if the number of
> datafile handles should be fewer than available.
> It seems like that could be generalized to get a list of server
> indices from the distribution with a distribution method that
> returns that info. We already store a distribution in the
> directory hints anyway, so we could probably just throw away that
> dfile_count parameter. In any case, I would be in favor of a
> change to the create sm that allows the distribution to optionally
> specify the server indices, and fall back to the cached config
> lookup otherwise.
>
>
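The fallback described above (use the server indices from the distribution when it supplies them, otherwise use the cached config) could be sketched roughly as follows. This is only an illustration, not the PVFS create state machine: struct sketch_dist and sketch_select_servers are hypothetical names, and the cached config lookup is stood in for by simply taking indices 0..num_config_servers-1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a distribution that may carry a server list. */
struct sketch_dist {
    int count;           /* number of entries in servers; 0 if unset */
    const int *servers;  /* caller-owned server indices, or NULL */
};

/*
 * Fill out[] with at most max_out server indices for a new file.
 * If the distribution supplies a list, copy it. Otherwise fall back to
 * indices 0..num_config_servers-1, which stands in for the cached
 * server config lookup (an assumption made for this sketch).
 * Returns the number of indices written.
 */
int sketch_select_servers(const struct sketch_dist *dist,
                          int num_config_servers,
                          int *out, int max_out)
{
    int i, n;

    assert(out != NULL);
    assert(max_out >= 0);

    if (dist != NULL && dist->servers != NULL && dist->count > 0) {
        /* The distribution specified its servers: honor that list. */
        n = dist->count < max_out ? dist->count : max_out;
        for (i = 0; i < n; i++)
            out[i] = dist->servers[i];
        return n;
    }

    /* No list from the distribution: use every configured server. */
    n = num_config_servers < max_out ? num_config_servers : max_out;
    if (n < 0)
        n = 0;
    for (i = 0; i < n; i++)
        out[i] = i;
    return n;
}
```

With a distribution carrying {2, 4, 6, 8}, the call writes those four indices; with a NULL or empty distribution it writes one index per configured server, capped at max_out.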
>> Of course, adding
>> another array to the distribution struct, which is common to all
>> distributions,
>> would solve the issue, but I would not prefer this solution.
>>
> No that's gross.
>
>> With the hint it is a common infrastructure given for all available
>> distributions and it also is an uncommon case to set the
>> distribution.
>
> Right, I don't like the idea of 'overriding' a distribution's
> behavior with a hint. Conceptually it seems like the distribution
> should take care of this, and if it doesn't, we should fix it so it
> does.
> Also, the server list hint gets a little ugly across sysint calls,
> since the list doesn't actually get stored anywhere for that file
> (does it?). With the distribution, we only have to specify it once
> at creation time, instead of passing it along to all future IO
> calls on that file.
Actually not sure what I was thinking there. The datafile handles of
course get stored in the dh keyval.
-sam
>
> -sam
>
>> Wasn't
>> the hint intended to be useful for research purposes?
>>
>> Thanks for the discussion and the time you spent,
>> Julian
>>
>> _______________________________________________
>> Pvfs2-developers mailing list
>> Pvfs2-developers at beowulf-underground.org
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>
>