[Pvfs2-developers] Distribution by hostname
Sam Lang
slang at mcs.anl.gov
Sun Oct 8 13:58:15 EDT 2006
On Oct 6, 2006, at 5:59 PM, Pete Wyckoff wrote:
> slang at mcs.anl.gov wrote on Fri, 06 Oct 2006 16:33 -0500:
>> On Oct 6, 2006, at 1:48 PM, Julian Martin Kunkel wrote:
>>> Also it will not
>>> allow setting the servers for all distributions...
>>
>> Yeah I can't imagine wanting to ever do that. It would mean passing
>> in a distribution different from the default simple-stripe, as well
>> as a hint saying you want a specific set of servers in the same
>> call. Seems sort of yucky to me. I'd rather have all the
>> information about the distribution in the distribution. You're even
>> able to use the distribution field in the directory hints structure
>> to specify per-directory IO server lists. Not that you would ever
>> want to do that either...
>
> I agree with Sam that this is yucky. I'm hijacking this thread.
>
> Let's forget about hints for a moment
I think everyone is already aware of this, but just to clarify terms,
I mentioned the 'directory hints' previously, but these aren't the
same hints that Julian has added to his hints branch. Murali added
these extended attributes:
user.pvfs2.dist_name
user.pvfs2.dist_params
user.pvfs2.num_dfiles
user.pvfs2.meta_hint
These are used to modify the behavior of files created in that
directory. In terms of functionality, if we were to provide a
distribution that allows us to enumerate the actual servers, or just
specify a number of servers (and let them be chosen randomly), then
the num_dfiles eattr becomes redundant. This is a case though where
one might want to be able to store the IO servers list in the
distribution (or again just a count of them).
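
For illustration, the per-directory eattrs listed above could be set with something like the sketch below. The attribute names are the ones from this message; the value formats (a plain distribution name, a decimal datafile count), the helper names, and the mount path in the usage note are assumptions made for the example, not a description of what PVFS2 actually expects.

```python
import os


def build_dir_hints(dist_name=None, dist_params=None, num_dfiles=None):
    """Return {eattr name: encoded value} for the hints that were supplied.

    Attribute names come from the message above. The value encodings are
    assumptions for illustration only.
    """
    hints = {}
    if dist_name is not None:
        hints["user.pvfs2.dist_name"] = dist_name.encode()
    if dist_params is not None:
        hints["user.pvfs2.dist_params"] = dist_params.encode()
    if num_dfiles is not None:
        if num_dfiles < 1:
            raise ValueError("num_dfiles must be at least 1")
        hints["user.pvfs2.num_dfiles"] = str(num_dfiles).encode()
    return hints


def apply_dir_hints(directory, hints, setter=None):
    """Write each hint onto the directory as an extended attribute.

    The setter is injectable so the logic can be exercised without a
    mounted file system; by default it is os.setxattr (Linux only).
    """
    if setter is None:
        setter = os.setxattr
    for name, value in hints.items():
        setter(directory, name, value)
```

A call such as `apply_dir_hints("/mnt/pvfs2/scratch", build_dir_hints(dist_name="simple_stripe", num_dfiles=4))` (hypothetical path and values) would then affect files created in that directory afterwards. A distribution that carried its own server list or server count would make the `num_dfiles` entry unnecessary, which is the redundancy noted above.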
> and decide how we want to
> extend the concept of distributions, as seen by users, in such a way
> that they can specify particular IO servers by name. If this is an
> interface people want, we should design it properly, not just
> implement it with hints because we (might) have them.
>
> Some issues, please suggest approaches and other issues. (I'm using
> "name" here to mean host alias.)
>
> 1. What kind of control do users want?
>
> - all data on one server by name?
> - arbitrary control of stripe sizes and host names?
>
> 2. New distribution name, or extension to existing ones?
>
> - dist-varstrip has a lot of flexibility, but no hostnames
> - maybe a new "dist-single-host-by-name" is all that is desired
>
> 3. Store hostnames in on-disk distribution?
>
> - guessing no for the single-stripe distro, but perhaps somebody
> can really think of a use case for this?
>
> 4. User API
>
> - through PVFS_dist_create
> - (please not through both PVFS_dist_create + some hint)
> - via environment variable too?
>
I would argue that environment variables are messy for this. We
already have a precedent for using extended attributes to group the
way new files get distributed, and I would imagine eattrs give you
some interesting capabilities when it comes to doing migration.
> If our design happens to end up as something that would be
> implemented well by hints, then we can think about using them. For
> now, let's just get the design correct.
>
> We can come back and argue the merits of a generic hint interface in
> a different thread.
>
> -- Pete