[PVFS2-users] Re: Timestamp metadata, heterogenous architecture, files and directory not accessible

Phil Carns pcarns at wastedcycles.org
Tue Oct 25 19:19:32 EDT 2005


Some history on that versioning issue: here is the commit log from when 
this stuff was added:
----------------
Date: 2004/10/27 21:12:27
Author: neill
Branch: HEAD
Tag: pvfs2-0-9-0
Log:
- added a setattr debugging mask and changed most detailed setattr
   debugging to use it
- added a mkdir debugging mask and changed most detailed mkdir
   debugging to use it
- added some inlined methods in PVFS_util (as they need to be used on
   both the server and the client) for getting the current time in
   PVFS_time format, encoding a PVFS_time as a version (finer grained
   than a 'normal' PVFS_time since we can use the high 32 bits), and
   decoding the version as a PVFS_time
- added a compatibility hack that _should_ allow no noticeable
   breakage on existing storage space, but will eventually migrate to
   the slightly new storage format over time
- modified client side sys-mkdir, sys-create, and sys-symlink to
   encode the mtime as a version when passing it to the server (so it's
   transparent from the server perspective)
- modified server side get-attr to decode the version read from disk
   back into an mtime (so it's transparent from the client perspective)
- modified mkspace method to properly version newly created root and
   lost+found directories
- modified the server side mkdir operation to return -PVFS_EINVAL if
   the object attr type is not a directory object
- misc debugging changes and cleanups
-------------------
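
To make the encode/decode step in that log concrete, here is a minimal 
sketch of packing a time into the high 32 bits of a 64-bit value and 
unpacking it again. The names (sketch_time, sketch_mktime_version, 
sketch_mkversion_time) are hypothetical stand-ins, and putting 
microseconds in the low 32 bits is an assumption; only the ">> 32" 
decode is visible in the patch quoted further down.

```c
#include <stdint.h>

/* Hypothetical stand-in for PVFS_time; assumed to be a signed 64-bit integer. */
typedef int64_t sketch_time;

/* Pack whole seconds into the high 32 bits and a sub-second value
 * (assumed here to be microseconds) into the low 32 bits. */
static inline sketch_time sketch_mktime_version(sketch_time seconds, uint32_t usec)
{
    return (sketch_time)(((uint64_t)seconds << 32) | (uint64_t)usec);
}

/* Recover the whole seconds by discarding the low 32 bits, mirroring
 * the ">> 32" in the decode helper shown in the quoted patch. */
static inline sketch_time sketch_mkversion_time(sketch_time version)
{
    return (sketch_time)((uint64_t)version >> 32);
}
```

Under these assumptions, two versions taken within the same second still 
differ in their low bits, while decoding either one gives back the same 
plain seconds value.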

I don't believe that it has anything to do with the resolution available 
on 2.6 kernels. I think instead that it had something to do with 
updating directory "versions" when new files are created or destroyed 
within a directory, so that there is a way to tell when readdir tokens 
need to be reset (see this mailing list thread):

http://www.beowulf-underground.org/pipermail/pvfs2-developers/2004-September/000819.html
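
As a rough illustration of that idea (this is not PVFS2 code; the 
struct, the field names, and the choice of 0 as the "start of 
directory" token are all made up for the example), a client could 
remember the directory version it saw when it started listing and 
rewind its token whenever the version changes:

```c
#include <stdint.h>

/* Hypothetical client-side state for an in-progress directory listing. */
struct sketch_dir_cursor {
    uint64_t seen_version; /* directory version observed when listing began */
    uint64_t token;        /* opaque position; 0 is assumed to mean "start" */
};

/* Returns 1 and rewinds the cursor if the directory changed since the
 * version was recorded, 0 if the saved token can still be used. */
static inline int sketch_revalidate(struct sketch_dir_cursor *cur,
                                    uint64_t current_version)
{
    if (cur->seen_version == current_version)
        return 0;
    cur->seen_version = current_version;
    cur->token = 0;
    return 1;
}
```

A version that changes on every create or remove, even several times 
within one second, is what would make a comparison like this reliable.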

There may be a good reason for it, but I'm not sure why the client is 
aware of the versioning information that is hidden in the mtime field. 
It seems like it would be nicer if the server hid that from everyone 
rather than make the client do some conversions and the server do others.
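
For reference, the check-and-decode step that the quoted patch adds on 
the client side amounts to something like the sketch below. The names 
(sketch_stamp, sketch_is_versioned, sketch_plain_mtime) are 
hypothetical, and the 64-bit signed type is an assumption about 
PVFS_time; the patch itself shifts the value directly rather than 
casting to unsigned first.

```c
#include <stdint.h>

/* Hypothetical stand-in for PVFS_time; assumed to be a signed 64-bit integer. */
typedef int64_t sketch_stamp;

/* Heuristic in the spirit of the quoted patch: any bits above the low 32
 * mean the value is treated as a packed version rather than plain seconds.
 * The cast avoids an implementation-defined right shift of a negative value. */
static inline int sketch_is_versioned(sketch_stamp v)
{
    return ((uint64_t)v >> 32) != 0;
}

/* Hand back plain seconds either way, decoding only when needed. */
static inline sketch_stamp sketch_plain_mtime(sketch_stamp v)
{
    return sketch_is_versioned(v) ? (sketch_stamp)((uint64_t)v >> 32) : v;
}
```

Note that this is only a heuristic: it relies on plain second counts 
fitting in 32 bits. Doing the decode in exactly one place would remove 
the need to guess at all.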

-Phil

Murali Vilayannur wrote:
> Hi Rob,
> 
> Sorry for not explaining what the patch does. I was sort of waiting for
> Simon to let us know if it fixed his problem :)
> 
> Anyway, what is happening is that the mtime is always versioned before
> it is stored at the server (I don't know in which PVFS2 version this
> changed, but earlier it used to be stored directly, iirc).
> 
> If you take a look at pvfs2_mkspace(), sys-create.sm, sys-mkdir.sm,
> sys-setattr.sm, and sys-symlink.sm,
> they all store the mtime like so...
> 
>     /* encode the mtime as a directory version */
>     sm_p->u.xxx.sys_attr.mtime =
>         PVFS_util_mktime_version(sm_p->u.xxx.sys_attr.mtime);
> 
> I am not sure what this change was really meant for or when it
> was introduced (possibly for microsecond
> resolution, but it is a pity that this is not conveyed all the way back to
> the 2.6 kernel VFS, which also understands nanosecond timestamp
> resolution).
> 
> If someone can confirm that the patch fixes this problem, then we can
> either decide to drop the versioned mtime altogether or we can propagate
> it all the way back to the kernel...
> Thanks,
> Murali
> 
> 
> 
> 
>>Hi Murali,
>>
>>So what's the deal with the istime_versioned() function?  Is this
>>something that only happens sometimes?  Why does it happen at all?
>>
>>Thanks,
>>
>>Rob
>>
>>Murali Vilayannur wrote:
>>
>>>Hi,
>>>Apologies for combining multiple email responses into one email!
>>>I did not catch any of these threads earlier on....
>>>
>>>Simon, could you try the attached patch (against the latest CVS)
>>>and let us know if it fixes your problem?
>>>I was able to reproduce it on an x86_64/ia32 setup and after this patch
>>>the timestamp problem disappeared.
>>>
>>>There is still the permission problem that Ekow Otoo brought up on the
>>>list, which I have not tracked down yet. I am able to reproduce it on an
>>>x86_64 box (not on an ia32 box). Ekow, could you confirm whether you are
>>>seeing this problem on an x86_64 or an IA-64 machine? I don't remember
>>>seeing the platform you were using in your email...
>>>
>>>Thanks,
>>>Murali
>>>
>>>
>>>
>>>>Rob Ross wrote:
>>>>
>>>>
>>>>
>>>>>By disappears you mean that the timestamp appears normal, right?
>>>>>
>>>>>Sounds like there's something amiss in how we copy the stat results
>>>>>into the user buffer.
>>>>>
>>>>>Thanks, we'll do some investigating.
>>>>>
>>>>>Rob
>>>>
>>>>
>>>>
>>>>Yep. In summary:
>>>>
>>>>1) Create any file in pvfs2 filesystem.
>>>>2) ls -l the file  =>  consistent but erroneous timestamp (Jan 1970)
>>>>3) 'touch' the file  =>  different but erroneous timestamp (another day
>>>>in Jan 1970)
>>>>4) umount pvfs2,  mount pvfs2  =>  same 1970 timestamp
>>>>5) umount, kill pvfs2-client, start pvfs2-client, mount pvfs2  =>
>>>>correct timestamp (i.e. last 'touch' time)
>>>>
>>>
>>>
>>>------------------------------------------------------------------------
>>>
>>>Index: include/pvfs2-util.h
>>>===================================================================
>>>RCS file: /anoncvs/pvfs2/include/pvfs2-util.h,v
>>>retrieving revision 1.38
>>>diff -u -r1.38 pvfs2-util.h
>>>--- include/pvfs2-util.h	4 Oct 2005 19:04:53 -0000	1.38
>>>+++ include/pvfs2-util.h	25 Oct 2005 00:51:09 -0000
>>>@@ -145,6 +145,11 @@
>>> {
>>>     return (PVFS_time)(version >> 32);
>>> }
>>>+
>>>+inline static int PVFS_util_istime_versioned(PVFS_time version_or_time)
>>>+{
>>>+    return (version_or_time >> 32 == 0 ? 0 : 1);
>>>+}
>>> #endif /* __KERNEL__ */
>>>
>>> #endif /* __PVFS2_UTIL_H */
>>>Index: src/client/sysint/acache.c
>>>===================================================================
>>>RCS file: /anoncvs/pvfs2/src/client/sysint/acache.c,v
>>>retrieving revision 1.20
>>>diff -u -r1.20 acache.c
>>>--- src/client/sysint/acache.c	23 Aug 2005 18:44:13 -0000	1.20
>>>+++ src/client/sysint/acache.c	25 Oct 2005 00:51:10 -0000
>>>@@ -18,6 +18,7 @@
>>> #include "acache.h"
>>> #include "quickhash.h"
>>> #include "pint-util.h"
>>>+#include "pvfs2-util.h"
>>>
>>> /* comment out the following for non-verbose acache debugging */
>>> #define VERBOSE_ACACHE_DEBUG
>>>@@ -369,7 +370,12 @@
>>>     pinode->refn.fs_id = refn.fs_id;
>>>     pinode->refn.handle = refn.handle;
>>>
>>>-    PINT_copy_object_attr(&pinode->attr, attr);
>>>+    PINT_copy_object_attr(&pinode->attr, attr,
>>>+            PVFS_util_istime_versioned(attr->mtime));
>>>+    gossip_debug(GOSSIP_ACACHE_DEBUG, "acache inserted handle %Ld "
>>>+            "atime: %Ld, mtime: %Ld, ctime: %Ld\n", pinode->refn.handle,
>>>+            pinode->attr.atime, PVFS_util_mkversion_time(pinode->attr.mtime),
>>>+            pinode->attr.ctime);
>>>
>>>     PINT_acache_set_valid(pinode);
>>>
>>>Index: src/client/sysint/sys-getattr.sm
>>>===================================================================
>>>RCS file: /anoncvs/pvfs2/src/client/sysint/sys-getattr.sm,v
>>>retrieving revision 1.85
>>>diff -u -r1.85 sys-getattr.sm
>>>--- src/client/sysint/sys-getattr.sm	10 Oct 2005 16:28:06 -0000	1.85
>>>+++ src/client/sysint/sys-getattr.sm	25 Oct 2005 00:51:10 -0000
>>>@@ -405,7 +405,10 @@
>>>
>>>           cache_hit:
>>>
>>>-            PINT_copy_object_attr(&sm_p->getattr.attr, &pinode->attr);
>>>+            PINT_copy_object_attr(&sm_p->getattr.attr, &pinode->attr,
>>>+                    PVFS_util_istime_versioned(pinode->attr.mtime));
>>>+            gossip_debug(GOSSIP_ACACHE_DEBUG, "Acache HIT! for %s: handle %Lu fsid %d, mtime %Ld\n",
>>>+                  __func__, Lu(object_ref.handle), object_ref.fs_id, Ld(pinode->attr.mtime));
>>>
>>>             if(trimmed_mask & PVFS_ATTR_DATA_SIZE)
>>>             {
>>>@@ -518,8 +521,8 @@
>>>      * then we can make a copy of the retrieved attribute for later
>>>      * caching.
>>>      */
>>>-    PINT_copy_object_attr(&sm_p->getattr.attr,
>>>-                          &resp_p->u.getattr.attr);
>>>+    PINT_copy_object_attr(&sm_p->getattr.attr, &resp_p->u.getattr.attr,
>>>+        PVFS_util_istime_versioned(resp_p->u.getattr.attr.mtime));
>>>
>>>     attr =  &sm_p->getattr.attr;
>>>     assert(attr);
>>>@@ -782,9 +785,13 @@
>>>         sm_p->getattr.attr.mask |= PVFS_ATTR_DATA_SIZE;
>>>     }
>>>
>>>-    PINT_copy_object_attr(&pinode->attr, &sm_p->getattr.attr);
>>>+    PINT_copy_object_attr(&pinode->attr, &sm_p->getattr.attr,
>>>+            PVFS_util_istime_versioned(sm_p->getattr.attr.mtime));
>>>
>>>     PINT_acache_set_valid(pinode);
>>>+    gossip_debug(GOSSIP_ACACHE_DEBUG, "Acache INSERT! for %s: handle %Lu fsid %d, mtime %Ld\n",
>>>+          __func__, Lu(sm_p->getattr.object_ref.handle), sm_p->getattr.object_ref.fs_id,
>>>+          Ld(pinode->attr.mtime));
>>>
>>>     if (release_required)
>>>     {
>>>Index: src/client/sysint/sys-lookup.sm
>>>===================================================================
>>>RCS file: /anoncvs/pvfs2/src/client/sysint/sys-lookup.sm,v
>>>retrieving revision 1.57
>>>diff -u -r1.57 sys-lookup.sm
>>>--- src/client/sysint/sys-lookup.sm	26 Aug 2005 19:10:56 -0000	1.57
>>>+++ src/client/sysint/sys-lookup.sm	25 Oct 2005 00:51:10 -0000
>>>@@ -737,7 +737,7 @@
>>>
>>>     PINT_free_object_attr(&(cur_seg->seg_attr));
>>>     PINT_copy_object_attr(&(cur_seg->seg_attr),
>>>-                          &(sm_p->getattr.attr));
>>>+                          &(sm_p->getattr.attr), PVFS_util_istime_versioned(sm_p->getattr.attr.mtime));
>>>
>>>     cur_ctx = GET_CURRENT_CONTEXT(sm_p);
>>>     assert(cur_ctx);
>>>@@ -1112,7 +1112,8 @@
>>>             PINT_free_object_attr(&(cur_seg->seg_attr));
>>>             PINT_copy_object_attr(
>>>                 &(cur_seg->seg_attr),
>>>-                &(resp_p->u.lookup_path.attr_array[i]));
>>>+                &(resp_p->u.lookup_path.attr_array[i]),
>>>+                PVFS_util_istime_versioned(resp_p->u.lookup_path.attr_array[i].mtime));
>>>         }
>>>     }
>>>     assert(i == resp_p->u.lookup_path.handle_count);
>>>Index: src/common/misc/pint-util.c
>>>===================================================================
>>>RCS file: /anoncvs/pvfs2/src/common/misc/pint-util.c,v
>>>retrieving revision 1.9
>>>diff -u -r1.9 pint-util.c
>>>--- src/common/misc/pint-util.c	23 Aug 2005 18:44:17 -0000	1.9
>>>+++ src/common/misc/pint-util.c	25 Oct 2005 00:51:10 -0000
>>>@@ -25,6 +25,7 @@
>>> #include "gen-locks.h"
>>> #include "gossip.h"
>>> #include "pvfs2-debug.h"
>>>+#include "pvfs2-util.h"
>>>
>>> static int current_tag = 1;
>>> static gen_mutex_t current_tag_lock = GEN_MUTEX_INITIALIZER;
>>>@@ -92,7 +93,7 @@
>>>     return ret;
>>> }
>>>
>>>-int PINT_copy_object_attr(PVFS_object_attr *dest, PVFS_object_attr *src)
>>>+int PINT_copy_object_attr(PVFS_object_attr *dest, PVFS_object_attr *src, int convert_mtime)
>>> {
>>>     int ret = -PVFS_ENOMEM;
>>>
>>>@@ -120,7 +121,10 @@
>>>         }
>>>         if (src->mask & PVFS_ATTR_COMMON_MTIME)
>>>         {
>>>-            dest->mtime = src->mtime;
>>>+            if (convert_mtime)
>>>+                dest->mtime = PVFS_util_mkversion_time(src->mtime);
>>>+            else
>>>+                dest->mtime = src->mtime;
>>>         }
>>> 	if (src->mask & PVFS_ATTR_COMMON_TYPE)
>>>         {
>>>Index: src/common/misc/pint-util.h
>>>===================================================================
>>>RCS file: /anoncvs/pvfs2/src/common/misc/pint-util.h,v
>>>retrieving revision 1.7
>>>diff -u -r1.7 pint-util.h
>>>--- src/common/misc/pint-util.h	8 Jun 2005 19:30:29 -0000	1.7
>>>+++ src/common/misc/pint-util.h	25 Oct 2005 00:51:10 -0000
>>>@@ -38,7 +38,7 @@
>>>
>>> PVFS_msg_tag_t PINT_util_get_next_tag(void);
>>>
>>>-int PINT_copy_object_attr(PVFS_object_attr *dest, PVFS_object_attr *src);
>>>+int PINT_copy_object_attr(PVFS_object_attr *dest, PVFS_object_attr *src, int convert_mtime);
>>> void PINT_free_object_attr(PVFS_object_attr *attr);
>>> void PINT_time_mark(PINT_time_marker* out_marker);
>>> void PINT_time_diff(PINT_time_marker mark1,
>>
>>
> _______________________________________________
> PVFS2-users mailing list
> PVFS2-users at beowulf-underground.org
> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users


