[Pvfs2-developers] PVFS-hint

Sam Lang slang at mcs.anl.gov
Fri Oct 6 13:19:49 EDT 2006


On Oct 6, 2006, at 2:01 AM, Julian Martin Kunkel wrote:

> Hi,
>> This way callers wouldn't be able to muck with the internals of the
>> hint struct.
> Ok, I will definitely do this.
>> As I said, I prefer letting the hint struct be defined
>> externally and requiring an array of them to the system interfaces.
>> It seems to match what we have throughout the rest of PVFS.
> If the pvfs2-team thinks that is the way to go I will do so, even if I
> prefer the list for the system interface. So I think the array wins?
>
>> Setting specific IO servers for a file is interesting for research
>> purposes, but kind of breaks the encapsulation and abstraction that
>> pvfs tries to provide.  We've already got a distribution parameter
>> that we pass in to PVFS_sys_create when appropriate.
>> I don't know that we want to go around telling users to set an
>> environment variable (a separate interface, remember) for specific
>> behavior that they've requested.  I'd rather figure out why the dist
>> parameter isn't providing the bits of functionality that they need,
>> and solve their problem from that angle.
>> I vaguely remember having
>> this discussion about specifying datafile handles through the dist
>> parameter, and there being some concerns, but I can't remember the
>> details.  Can you remind me why that doesn't work?
> The dist parameter is opaque to the client interface and should only be
> interpreted by the distribution. However, at the distribution layer the
> hostnames are not available. Also, it would be available only to
> distributions which interpret the given hostnames.

The specific dist implementation is certainly opaque from the client  
interface, but we do provide functions to set parameters on a  
particular distribution.  What I would envision is something along  
these lines:

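struct PINT_dist_server_settable_indices {
	int count;	/* number of entries in servers */
	int *servers;	/* indices of the IO servers to use */
};

...

PVFS_sys_dist mydist;
/* a pointer member can't take a brace list, so name the array first */
int server_list[] = {2, 4, 6, 8};
struct PINT_dist_server_settable_indices indices =
{
	4,
	server_list
};

...

mydist.name = "server-settable-dist";

PVFS_sys_dist_setparam(&mydist, "servers", &indices);

...

PVFS_sys_create(..., &mydist, ...);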

The server-settable-dist would be implemented to store the indices  
for the IO servers in the params field of the PVFS_sys_dist  
structure.  I used server indices, because the PINT_dist_* interfaces  
allow for that (through the PINT_request_file_data struct), but it's a  
bit ugly and probably confusing to the user.  We could change that  
though, and use something like server aliases or hostnames, passing  
them through to the distribution parameters instead of indices.  This  
would require a change to the distribution method calls, and the

> Also the create sm has to
> be modified a lot to use the new distribution facilities.

It certainly would.  The client create state machine is a little  
broken in this respect anyway though.  It normally just gets the list  
of IO servers from the cached server config file and does a create  
request to all of them.  There's special case handling right now for  
the directory hints on the parent to see if the number of datafile  
handles should be fewer than available.
It seems like that could be generalized to get a list of server  
indices from the distribution with a distribution method that returns  
that info.  We already store a distribution in the directory hints  
anyway, so we could probably just throw away that dfile_count  
parameter.  In any case, I would be in favor of a change to the  
create sm that allows the distribution to optionally specify the  
server indices, and fall back to the cached config lookup otherwise.
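
To make the fallback concrete, here is a rough sketch of the selection
logic only. None of these names exist in PVFS: dist_sketch,
get_server_indices, settable_get_server_indices and select_servers are all
made up for illustration, and the real change would live in the create
state machine and the distribution method table.

```c
#include <stddef.h>

/* Hypothetical stand-in for a distribution; NOT the real PVFS_sys_dist. */
struct dist_sketch {
    /* Optional method: writes up to 'max' server indices into 'out' and
     * returns how many it wrote.  NULL means "no preference". */
    int (*get_server_indices)(const struct dist_sketch *dist,
                              int *out, int max);
    const int *servers;  /* indices stored in the dist params */
    int count;           /* number of entries in servers */
};

/* Hypothetical method for a "server-settable" distribution: it simply
 * hands back the indices that were stored at setparam time. */
static int settable_get_server_indices(const struct dist_sketch *dist,
                                       int *out, int max)
{
    int n = dist->count < max ? dist->count : max;
    for (int i = 0; i < n; i++)
        out[i] = dist->servers[i];
    return n;
}

/* Choose the IO servers for a create.  Ask the distribution first; if it
 * has no method or returns nothing, fall back to every server in the
 * cached config (represented here only by a server count). */
static int select_servers(const struct dist_sketch *dist,
                          int cached_server_count, int *out, int max)
{
    if (dist != NULL && dist->get_server_indices != NULL) {
        int n = dist->get_server_indices(dist, out, max);
        if (n > 0)
            return n;
    }

    int n = cached_server_count < max ? cached_server_count : max;
    for (int i = 0; i < n; i++)
        out[i] = i;
    return n;
}
```

A distribution that doesn't care about placement leaves the method NULL and
gets today's behavior, so existing distributions wouldn't need to change.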


> Of course, adding another array to the distribution struct, which is
> common to all distributions, would solve the issue, but I would not
> prefer this solution.
>
No, that's gross.

> With the hint it is a common infrastructure given for all available
> distributions and it also is an uncommon case to set the distribution.

Right, I don't like the idea of 'overriding' a distribution's  
behavior with a hint.  Conceptually it seems like the distribution  
should take care of this, and if it doesn't, we should fix it so it  
does.
Also, the server list hint gets a little ugly across sysint calls,  
since the list doesn't actually get stored anywhere for that file  
(does it?).  With the distribution, we only have to specify it once  
at creation time, instead of passing it along to all future IO calls  
on that file.

-sam

> Wasn't the hint intended to be useful for research purposes?
>
> Thanks for the discussion and the time you spent,
> Julian
>