[Pvfs2-developers] Distribution by hostname

Sam Lang slang at mcs.anl.gov
Sun Oct 8 13:58:15 EDT 2006


On Oct 6, 2006, at 5:59 PM, Pete Wyckoff wrote:

> slang at mcs.anl.gov wrote on Fri, 06 Oct 2006 16:33 -0500:
>> On Oct 6, 2006, at 1:48 PM, Julian Martin Kunkel wrote:
>>> Also it will not
>>> allow setting the servers for all distributions...
>>
>> Yeah I can't imagine wanting to ever do that.  It would mean passing
>> in a distribution different from the default simple-stripe, as well
>> as a hint saying you want a specific set of servers in the same
>> call.  Seems sort of yucky to me.  I'd rather have all the
>> information about the distribution in the distribution.  You're even
>> able to use the distribution field in the directory hints structure
>> to specify per-directory IO server lists.  Not that you would ever
>> want to do that either...
>
> I agree with Sam that this is yucky.  I'm hijacking this thread.
>
> Let's forget about hints for a moment

I think everyone is already aware of this, but just to clarify terms:
the 'directory hints' I mentioned previously aren't the same hints
that Julian has added to his hints branch.  Murali added these
extended attributes:

user.pvfs2.dist_name
user.pvfs2.dist_params
user.pvfs2.num_dfiles
user.pvfs2.meta_hint

These modify the behavior of files created in that directory.  In
terms of functionality, if we were to provide a distribution that
lets us enumerate the actual servers, or just specify a number of
servers (and let them be chosen randomly), then the num_dfiles eattr
becomes redundant.  This is one case, though, where one might want to
store the IO server list (or again just a count of servers) in the
distribution.
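
To make the redundancy point concrete, here is a minimal Python sketch
of how the directory eattrs listed above might be assembled and
applied.  Only the four user.pvfs2.* names come from this message.  The
values ("simple_stripe", "strip_size:65536"), the distribution name
"hypothetical_server_list", and its "servers:a,b,c" parameter format
are assumptions made up for illustration, not an existing PVFS2
distribution or a confirmed parameter syntax.

```python
import os

# Extended attribute names listed in the message above.
DIST_NAME = "user.pvfs2.dist_name"
DIST_PARAMS = "user.pvfs2.dist_params"
NUM_DFILES = "user.pvfs2.num_dfiles"
META_HINT = "user.pvfs2.meta_hint"


def build_dir_hints(dist_name, dist_params=None, num_dfiles=None):
    """Return {eattr name: bytes value} for the hints that are set."""
    hints = {DIST_NAME: dist_name.encode()}
    if dist_params is not None:
        hints[DIST_PARAMS] = dist_params.encode()
    if num_dfiles is not None:
        hints[NUM_DFILES] = str(num_dfiles).encode()
    return hints


def hints_for_server_list(servers):
    """Hints for a HYPOTHETICAL distribution that names its servers.

    The distribution name and the "servers:" parameter format are
    assumptions.  No num_dfiles eattr is emitted: the datafile count
    is already implied by the length of the server list.
    """
    params = "servers:" + ",".join(servers)
    return build_dir_hints("hypothetical_server_list", params)


def apply_dir_hints(directory, hints, setxattr=None):
    """Set each hint as an extended attribute on the directory.

    Uses os.setxattr (Linux only) unless another setter is passed in.
    """
    setter = setxattr if setxattr is not None else os.setxattr
    for name, value in hints.items():
        setter(directory, name, value)
```

With today's eattrs one would pass a count explicitly, e.g.
build_dir_hints("simple_stripe", "strip_size:65536", num_dfiles=4).
With a server-list distribution, hints_for_server_list(["io1", "io2",
"io3"]) carries the same information in dist_params alone, which is
why num_dfiles would become redundant.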

> and decide how we want to
> extend the concept of distributions, as seen by users, in such a way
> that they can specify particular IO servers by name.  If this is an
> interface people want, we should design it properly, not just
> implement it with hints because we (might) have them.
>
> Some issues, please suggest approaches and other issues.  (I'm using
> "name" here to mean host alias.)
>
> 1.  What kind of control do users want?
>
>     - all data on one server by name?
>     - arbitrary control of stripe sizes and host names?
>
> 2.  New distribution name, or extension to existing ones?
>
>     - dist-varstrip has a lot of flexibility, but no hostnames
>     - maybe a new "dist-single-host-by-name" is all that is desired
>
> 3.  Store hostnames in on-disk distribution?
>
>     - guessing no for the single-stripe distro, but perhaps somebody
>     can really think of a use case for this?
>
> 4.  User API
>
>     - through PVFS_dist_create
>     - (please not through both PVFS_dist_create + some hint)
>     - via environment variable too?
>

I would argue that environment variables are messy for this.  We
already have a precedent for using extended attributes to control
how new files get distributed, and I would imagine eattrs give you
some interesting capabilities when it comes to doing migration.

> If our design happens to end up as something that would be
> implemented well by hints, then we can think about using them.  For
> now, let's just get the design correct.
>
> We can come back and argue the merits of a generic hint interface in
> a different thread.
>
> 		-- Pete
> _______________________________________________
> Pvfs2-developers mailing list
> Pvfs2-developers at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>


