[PVFS-users] PVFS mechanics
Rob Ross
rross at mcs.anl.gov
Sat Dec 4 11:57:28 EST 2004
Hi Brian,
Clients talk to the metadata server (mgr) at open and close, and for
metadata operations. The metadata server also talks with the I/O servers
(iods) at various points, particularly when a file is opened or closed,
and when someone asks for the size of a file that is currently open.
You're tcpdump'ing a regular UNIX cp?
So, your scenario goes something like this:
0. Nothing happens until a client requests either a metadata or I/O
operation to a PVFS file or directory, unless you're looking at the
mount operation too.
1. Client interacts with metadata server to read directory, if for no
other reason than to see if the file already exists.
2. Metadata server replies with directory contents as necessary.
3. Client opens destination file in PVFS directory.
4. Metadata server notifies the I/O servers that this particular client is
opening the file with particular permissions, and returns a
"capability" (just an integer really) to the client identifying this
instance of the open file. The acknowledgement from the metadata server
also includes a bit of data allowing the client to map file pieces to
servers.
5. Client interacts directly with I/O servers, sending first a
request to write and then chunks of data to servers as quickly as
network and TCP buffers allow.
I/O servers respond with an acknowledgement at the end of the write,
indicating that they did in fact get all the data.
6. Client interacts with metadata server to close the file. Client also
sends messages to I/O servers indicating that it is closing the file.
If no one has the file open any longer, the metadata server will
contact the I/O servers, let them know that the file has been closed,
and get an updated modification time and such. Likewise,
if you have the "fast-stats" option set, then the metadata server will
perform some extra communication to get and store the new file size.
I think that's a pretty accurate representation of exactly what happens.
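To make the "map file pieces to servers" part of step 4 concrete, here is a
rough sketch in Python. It assumes plain round-robin striping over a fixed
number of I/O servers with a fixed stripe size; the function names and the
pcount/ssize parameters are chosen for illustration, and this is not the
actual PVFS client code.

```python
def locate(offset, pcount, ssize):
    """Map a logical file offset to (server index, offset in that server's piece).

    Assumes round-robin striping: stripe unit k lives on server k % pcount.
    """
    stripe = offset // ssize                      # which stripe unit overall
    server = stripe % pcount                      # which I/O server holds it
    local = (stripe // pcount) * ssize + offset % ssize
    return server, local


def split_write(offset, length, pcount, ssize):
    """Break one logical write into per-server (server, local_offset, nbytes) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        server, local = locate(offset, pcount, ssize)
        # Never cross a stripe-unit boundary within a single piece.
        nbytes = min(ssize - offset % ssize, end - offset)
        pieces.append((server, local, nbytes))
        offset += nbytes
    return pieces
```

With 4 servers and a 64 KB stripe, split_write(0, 200000, 4, 65536) yields
four pieces: full 65536-byte stripe units for servers 0, 1, and 2, and the
remaining 3392 bytes for server 3.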
If you're actually looking at "cp", there are probably some extra
interactions with the metadata server that are stat() calls; you could
strace cp to find out what it is doing.
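If you do capture an strace, something like the following Python sketch
could tally which system calls cp made. It assumes the usual strace text
format, where each line starts with the call name followed by "(",
optionally preceded by a PID; count_syscalls is a name made up for this
example.

```python
import re
from collections import Counter

# Matches e.g. 'stat("file", ...) = 0' or '1234 stat("file", ...) = 0'.
SYSCALL = re.compile(r"^(?:\d+\s+)?(\w+)\(")


def count_syscalls(lines):
    """Count system calls by name in an iterable of strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL.match(line)
        if match:                      # skips signal and exit-status lines
            counts[match.group(1)] += 1
    return counts
```

For example, run "strace -o cp.trace cp somefile /mnt/pvfs/" (both paths are
placeholders), then call count_syscalls(open("cp.trace")). The stat-family
entries in the result should correspond to the extra metadata server
interactions.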
Regards,
Rob
On Sat, 4 Dec 2004, Brian Jones wrote:
> Hi all,
>
> I'm actually just trying to make sure I understand how PVFS
> really *works*, because the documentation (though there's plenty of
> it) is rather vague with regard to the conversation that takes place
> between the players involved. I'm writing an article on PVFS, partly as
> an exercise to make sure I understand it myself, and want to make sure I
> have all my ducks in a row. I'm running pvfs as supplied with ROCKS
> 3.3.0 on dual 933 PIII nodes.
>
> Best I can tell from doing lots of tcpdumps, when a pvfs client wants to
> copy a file to a pvfs directory, the following things happen:
>
> 1. The client contacts the metadata server, presumably to request
> service, possibly to say it wants to either read or write, but for all I
> know it may just be a ping. Clarification?
>
> 2. The metadata server responds and says something akin to "whaddya
> got?"
>
> 3. The client tells the metadata server that it wants to copy a file in,
> of a particular size, having particular ownership attributes, etc.
>
> 4. The metadata server records this information, and tells the client
> where to send the actual data chunks that make up this file.
>
> 5. The client sends the data chunks off to the designated IODs, who send
> back an "ack" saying they got the data.
>
> 6. The client sends some form of confirmation of these events to the
> metadata server, so it knows for sure where everything is in the event
> that the file later needs to be accessed/deleted/whatever.
>
> Of course, this is all from the client end of the conversation. If
> anyone wants to add/correct/clarify anything, PLEASE do so.
>
> Thanks,
> brian.