[PVFS-developers] faked /etc/mtab entry

James MacKinnon jmack@Phys.UAlberta.CA
Sun, 22 Oct 2000 16:13:17 -0600


On the TCP/UDP front:

Looks like you may be at least partly right Rob :-)

PVFS can "sort-of-work" across firewalls between clusters, at least in
read-only mode. Write operations, however, make things hang.

I have PVFS iod's operating on firewalled nodes on THOR, one of our
clusters. thor-gw is the front-end gateway for it and runs the pvfs mgr.
On a second cluster, CM, I loaded the kernel module and daemon on the
nodes behind that gateway, and voila (h1 is a node behind cm-gw, the
second cluster's front-end host). h1 is acting merely as a client here:


[root@h1 /]# ls -l /pvfs/bench/
total 6240300
-rwxr-xr-x   1 root     root        22920 Oct 22 14:24 bonnie
-rwxr-xr-x   1 root     root     2097152000 Oct 22 15:01 bonnie.25585
-rw-r--r--   1 root     root        18421 Oct 22 13:40 bonnie.c
-rw-r--r--   1 root     root          135 Oct 22 14:24 bonnie.log
-rw-r--r--   1 root     root     2146435072 Oct 22 13:15 chunkfile.jnk0
-rw-r--r--   1 root     root     2146435072 Oct 22 13:29 chunkfile.jnk1
-rw-r--r--   1 root     root          167 Oct 22 13:20 tlog0
-rw-r--r--   1 root     root          167 Oct 22 13:35 tlog1
-rw-r--r--   1 root     root         1767 Oct 22 13:55 usage.log

The IPchains rules for both thor-gw and cm-gw allow forwarding/ip_masq
of TCP ports 3000 and 7050 (7050 is what I'm using for the IOD's)
 
Thus PVFS can operate similarly to AFS in this respect. Very cool!
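
A quick way to confirm that the forwarded ports are reachable from a
client node is a timed TCP connect. The following is a minimal sketch,
not part of PVFS: the helper names tcp_port_open and probe are made up
for illustration, and labelling port 3000 as the mgr port is an
assumption (only 7050 is stated above to be the IOD port).

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False

# Ports named in this message; the "mgr" label for 3000 is an assumption.
PVFS_PORTS = {"mgr": 3000, "iod": 7050}

def probe(host, ports=PVFS_PORTS, timeout=3.0):
    """Map each service label to whether its port accepted a connection."""
    return {name: tcp_port_open(host, port, timeout) for name, port in ports.items()}

if __name__ == "__main__":
    # Hypothetical usage from a client node such as h1.
    print(probe("thor-gw"))
```

A successful connect only shows that the outbound path through the
gateways works. It says nothing about connections a server might try
to open back toward the client.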


But if I try a write to /pvfs from node h1, then things hang:

[root@h1 /pvfs]# touch jnk
touch: jnk: No such file or directory
[root@h1 /pvfs]# touch jnk
<Ctrl-C>

[root@h1 /]# ls /pvfs
<Ctrl-C>

no responses on /pvfs at all after that.

Stopping and restarting the daemon on node h1 gets me back the
readability, and it seems the new jnk file is there after that:

[root@h1 /]# ls -l /pvfs/
total 8
drwxr-xr-x   1 root     root         4096 Oct 22 15:17 bench
drwxr-xr-x   1 jmack    202          4096 Oct 22 13:43 jmack
-rw-r--r--   1 root     root            0 Oct 22 15:09 jnk

So this means that the write call went through, but something else in
the daemon code is hanging before completion and preventing normal
operations.

The syslog on node h1 (behind the cm-gw firewall; 129.128.7.236 is thor-gw)
shows an enqueue problem:

(ll_pvfs.c, 129): ll_pvfs_hint failed on enqueue for
		  129.128.7.236:3000/localdisk/pvfs_root/jnk
(ll_pvfs.c, 359): ll_pvfs_getmeta failed on enqueue for
		  129.128.7.236:3000/localdisk/pvfs_root
(ll_pvfs.c, 359): ll_pvfs_getmeta failed on enqueue for
		  129.128.7.236:3000/localdisk/pvfs_root
(ll_pvfs.c, 359): ll_pvfs_getmeta failed on enqueue for
		  129.128.7.236:3000/localdisk/pvfs_root
(ll_pvfs.c, 359): ll_pvfs_getmeta failed on enqueue for
		  129.128.7.236:3000/localdisk/pvfs_root
(pvfsdev.c, 334): pvfsdev: 1 upcall sequences removed, 1 of which had
		  become invalid.

(The first one looks like it's for the call to create file 'jnk'; the ones
after that appear to be requests by h1 to list /pvfs.)

Any ideas on this? Is there another back-channel port used in PVFS
that needs IPchains treatment? Maybe something to do with cwd
compatibility?

This "almost" functionality of PVFS across firewalls makes me want to
rethink the syntax for a PVFS /etc/mtab entry, in that it should contain a
domain identifier of some sort, i.e., a slightly more traditional:

	thor-gw:PVFS 	/pvfs pvfs rw 0 0

so that multiple entries from different hosts can be added and show up
separately.
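
To make the proposed syntax concrete, here is a minimal sketch that
builds and parses such entries using the standard six-field mtab
layout. The helper names (format_pvfs_entry, parse_mtab_line,
pvfs_mounts_by_host) and the second host and mount point in the usage
lines are illustrative assumptions, not anything mount.pvfs provides.

```python
from collections import namedtuple

MtabEntry = namedtuple("MtabEntry", "fsname mountpoint fstype options freq passno")

def format_pvfs_entry(mgr_host, mountpoint, options="rw"):
    """Build an mtab line whose device field names the mgr host, e.g. thor-gw:PVFS."""
    return f"{mgr_host}:PVFS\t{mountpoint} pvfs {options} 0 0"

def parse_mtab_line(line):
    """Split one mtab line into its six whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    fsname, mountpoint, fstype, options, freq, passno = fields
    return MtabEntry(fsname, mountpoint, fstype, options, int(freq), int(passno))

def pvfs_mounts_by_host(lines):
    """Group PVFS mount points by the host part of the device field."""
    mounts = {}
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        entry = parse_mtab_line(line)
        if entry.fstype != "pvfs":
            continue
        host = entry.fsname.split(":", 1)[0]
        mounts.setdefault(host, []).append(entry.mountpoint)
    return mounts

if __name__ == "__main__":
    # Hypothetical: two PVFS file systems served from different gateways.
    lines = [format_pvfs_entry("thor-gw", "/pvfs"),
             format_pvfs_entry("cm-gw", "/pvfs2")]
    print(pvfs_mounts_by_host(lines))
```

With the host in the device field, entries from different gateways stay
distinguishable to anything that reads /etc/mtab.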


Cheers,
-Jim

On Sun, 22 Oct 2000, Robert Ross wrote:

> On Sun, 22 Oct 2000, James MacKinnon wrote:
> 
> > On Sun, 22 Oct 2000, Robert Ross wrote:
> > 
> > > I've just about gotten the df output working right (it's the statfs() call
> > > that has to work).  It will report the total size, the amount used, and an
> > > estimate of the amount available (min_free * nr_iods).
> > 
> > Great!
> >  
> > > It's nice to see that the mtab entry is about all that is holding us back
> > > on the "real mounting" (i.e. having an entry in /etc/fstab, using plain
> > > ole mount) front; perhaps we can get that wrapped up soon too.
> > 
> > /etc/fstab heuristics could be used, but it's mostly for NFS and local, so
> > it may not be entirely appropriate to have PVFS in there (too many other
> > dependencies and setup required before mount could reasonably deal with
> > it).
> 
> Hmm.  I see your point, but I think that having it in there with a
> "noauto" option would be a good way to make things more understandable for
> users.  You're right that it is necessary to get all the daemons up and
> running first, but it's not that much different from NFS.
> 
> > AFS does the /etc/mtab insertion in its own code, since there is a lot of
> > setup to do (modules, daemons, etc). Perhaps PVFS could just do similarly
> > by adding a simple stanza into the mount.pvfs code on successful completion
> > of the mount.
> > 
> > BTW, I might add additionally that with a 'faked' PVFS entry in /etc/mtab,
> > a simple unix:
> > 
> > 	umount /pvfs
> > 
> > does the dismount just fine and removes the PVFS entry in /etc/mtab
> > (one can then stop the pvfsd and rmmod the module without problem.)
> 
> Cool.  It doesn't seem like it would be very hard to get mount.pvfs to
> create such an entry.  Any other opinions on which way we should go with
> this?
> 
> > > We'll probably never use UDP, so I wouldn't sweat that :).  It's a good
> > > point though, and I hadn't noticed that before.
> > 
> > UDP in AFS is what makes it robust :-) ( clients may timeout, but don't 
> > hang indefinitely). 
> 
> Sorry, but I don't buy this argument.  Careful use of select() and/or
> threads can get around issues of hangs in TCP code.  And implementing your
> own reliable protocol on top of UDP just creates that much more code,
> which is just that many more places for bugs to appear.
> 
> Additionally, all the reliable UDP protocols I've seen in code I have
> looked at (MPICH, LAM-MPI, PVM) have been slower than the TCP
> counterparts.
> 
> > UDP also allows AFS to be used behind firewalls (our clusters all have 
> > regular public network AFS mounts within private IP domains and can see
> > the cells in the outside world at Geneva, Italy, Hamburg, etc, as well
> > as our own Campus and local physics AFS cells).
> 
> I don't buy this either :).  Firewalls can be configured to let TCP
> through just as easily as UDP.  There may be policies in effect that limit
> the ability to use TCP, but there is nothing about TCP specifically that
> would make it inappropriate for use through firewalls.  
> 
> Is there some particular characteristic of UDP that you see as beneficial
> in the firewalled environment?  Perhaps I'm just missing something;
> wouldn't be the first time :)...
> 
> Thanks for the info, input, and sparking conversation on this list!
> 
> Rob
> ---
> Rob Ross, Mathematics and Computer Science Division, Argonne National Lab
> 
> 
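
On the point quoted above that careful use of select() can avoid hangs
in TCP code, here is a minimal sketch of a receive that gives up after
a timeout instead of blocking forever. recv_with_timeout is a
hypothetical helper written for illustration; it is not taken from the
PVFS sources.

```python
import select
import socket

def recv_with_timeout(sock, nbytes, timeout):
    """Wait up to timeout seconds for data; return the bytes read, or None on timeout.

    An empty bytes object means the peer closed the connection.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # nothing arrived in time; the caller can retry or give up
    return sock.recv(nbytes)

if __name__ == "__main__":
    a, b = socket.socketpair()
    print(recv_with_timeout(a, 16, 0.1))   # no data sent yet
    b.sendall(b"ping")
    print(recv_with_timeout(a, 16, 1.0))
    a.close()
    b.close()
```

The caller gets control back whether or not the peer answers, which is
the property the UDP timeout argument is after.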

--
James S. MacKinnon           Office: P-139 Avadh-Bhatia Physics Lab
Team Physics                 Voice : (780) 492-8226 [old AC 403]
University of Alberta        email : Jim.MacKinnon@Phys.UAlberta.CA
Edmonton, Canada T6G 2N5     WWW   : http://www.phys.ualberta.ca/

for all that we know the universe could cease to exist at any mo