[PVFS-users] PVFS mechanics

Rob Ross rross at mcs.anl.gov
Sat Dec 4 11:57:28 EST 2004


Hi Brian,

Clients talk to the metadata server (mgr) at open and close, and for 
metadata operations.  The metadata server also talks with the I/O servers 
(iods) at various points, particularly when a file is opened or closed, 
and when someone asks for the size of a file that someone else currently 
has open.

You're tcpdump'ing a regular UNIX cp?

So, your scenario goes something like this:

0. Nothing happens until a client requests either a metadata or an I/O 
   operation on a PVFS file or directory, unless you're looking at the 
   mount operation too.

1. Client interacts with metadata server to read directory, if for no 
   other reason than to see if the file already exists.

2. Metadata server replies with directory contents as necessary.

3. Client opens destination file in PVFS directory.

4. Metadata server will notify I/O servers that the particular client is 
   opening the file with particular permissions, and it will return a 
   "capability" (just an integer really) to the client indicating an 
   instance of the open file.  The acknowledgement from the metadata server 
   also includes a bit of data allowing the client to map file pieces to 
   servers.

5. Client interacts directly with I/O servers, sending first a 
   request to write and then chunks of data to servers as quickly as
   network and TCP buffers allow.

   I/O servers respond with an acknowledgement at the end of the write, 
   indicating that they did in fact get all the data.

6. Client interacts with metadata server to close the file.  Client also 
   sends messages to I/O servers indicating that it is closing the file.

   If the file is no longer open by anyone, the metadata
   server will contact I/O servers, let them know that the file has 
   been closed, and get an updated modification time and such.  Likewise, 
   if you have the "fast-stats" option set, then the metadata server will 
   perform some extra communication to get and store the new file size.
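
The steps above can be sketched as a small simulation.  This is only an 
illustration, not the PVFS wire protocol: the message strings, the 
function names stripe_map and trace_copy, and the defaults are all made 
up for the example.  It also assumes that the "bit of data" from step 4 
describes simple round-robin striping (a stripe size, a server count, and 
a starting server), that the file spans every iod, and that nobody else 
has the file open at close time.  Real writes to different servers would 
overlap in time; here they are listed one server after another.

```python
from collections import defaultdict


def stripe_map(offset, length, stripe_size, n_iods, base=0):
    """Split a byte range into (iod_index, file_offset, nbytes) pieces.

    Assumes round-robin striping starting at server `base`.
    """
    pieces = []
    end = offset + length
    while offset < end:
        stripe = offset // stripe_size
        iod = (base + stripe) % n_iods
        # Stop at the end of the current stripe or the end of the range.
        nbytes = min(stripe_size - offset % stripe_size, end - offset)
        pieces.append((iod, offset, nbytes))
        offset += nbytes
    return pieces


def trace_copy(file_size, stripe_size=65536, n_iods=4):
    """Return (sender, receiver, message) tuples for one copy into PVFS."""
    iods = [f"iod{i}" for i in range(n_iods)]
    log = [
        ("client", "mgr", "read directory"),       # step 1
        ("mgr", "client", "directory contents"),   # step 2
        ("client", "mgr", "open"),                 # step 3
    ]
    # Step 4: mgr notifies the iods, then acknowledges the client.
    log += [("mgr", iod, "client opening file") for iod in iods]
    log.append(("mgr", "client", "ack: capability + layout"))
    # Step 5: client writes directly to each iod that holds a piece.
    per_iod = defaultdict(int)
    for idx, _, nbytes in stripe_map(0, file_size, stripe_size, n_iods):
        per_iod[idx] += nbytes
    for idx in sorted(per_iod):
        log.append(("client", iods[idx], "write request"))
        log.append(("client", iods[idx], f"data: {per_iod[idx]} bytes"))
        log.append((iods[idx], "client", "ack: write complete"))
    # Step 6: client closes with mgr and iods; mgr then collects mtime.
    log.append(("client", "mgr", "close"))
    log += [("client", iod, "close") for iod in iods]
    log += [("mgr", iod, "file closed; send mtime") for iod in iods]
    return log


if __name__ == "__main__":
    for sender, receiver, message in trace_copy(200000):
        print(f"{sender:>6} -> {receiver:<6} {message}")
```

Comparing the printed sequence against a tcpdump capture should make it 
easier to tell which packets belong to which step.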

I think that's a pretty accurate representation of exactly what happens.  
If you're actually looking at "cp", there are probably some extra 
interactions with the metadata server that are stat() calls; you could 
strace cp to find out what it is doing.
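
As a rough guide to what to look for in the strace output, here is a 
minimal copy written directly against the system call wrappers in 
Python's os module.  It is a sketch, not what cp actually does: the real 
cp may issue more stat() calls and use different buffer sizes, and the 
function name copy_file and the 64 KB chunk size are arbitrary choices 
for the example.  The comments tie each call back to the steps above.

```python
import os


def copy_file(src, dst, chunk=65536):
    """Copy src to dst using explicit stat/open/read/write/close calls."""
    st = os.stat(src)                       # metadata operation on the source
    in_fd = os.open(src, os.O_RDONLY)
    # Opening the destination is where the mgr open (steps 3-4) would occur.
    out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC,
                     st.st_mode & 0o777)
    copied = 0
    try:
        while True:
            buf = os.read(in_fd, chunk)
            if not buf:
                break
            # write() may be partial, so loop until the chunk is written.
            off = 0
            while off < len(buf):
                off += os.write(out_fd, buf[off:])   # step 5: data to iods
            copied += len(buf)
    finally:
        os.close(out_fd)                    # step 6: close the destination
        os.close(in_fd)
    return copied
```

Running this under strace against a PVFS destination and comparing it 
with the trace of cp should show which extra calls cp makes.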

Regards,

Rob

On Sat, 4 Dec 2004, Brian Jones wrote:

> Hi all, 
> 
> I'm actually just trying to make sure I really understand how PVFS
> really *works*, because the documentation (though there's plenty of
> it) is rather vague with regard to the conversation that takes place
> between the players involved. I'm writing an article on PVFS, partly as
> an exercise to make sure I understand it myself, and want to make sure I
> have all my ducks in a row. I'm running pvfs as supplied with ROCKS
> 3.3.0 on dual 933 PIII nodes. 
> 
> Best I can tell from doing lots of tcpdumps, when a pvfs client wants to
> copy a file to a pvfs directory, the following things happen:
> 
> 1. The client contacts the metadata server, presumably to request
> service, possibly to say it wants to either read or write, but for all I
> know it may just be a ping. Clarification?
> 
> 2. The metadata server responds and says something akin to "whaddya
> got?"
> 
> 3. The client tells the metadata server that it wants to copy a file in,
> of a particular size, having particular ownership attributes, etc.
> 
> 4. The metadata server records this information, and tells the client
> where to send the actual data chunks that make up this file. 
> 
> 5. The client sends the data chunks off to the designated IODs, who send
> back an "ack" saying they got the data. 
> 
> 6. The client sends some form of confirmation of these events to the
> metadata server, so it knows for sure where everything is in the event
> that the file later needs to be accessed/deleted/whatever. 
> 
> Of course, this is all from the client end of the conversation. If
> anyone wants to add/correct/clarify anything, PLEASE do so. 
> 
> Thanks,
> brian.

