[Pvfs2-developers] Review: readcaching code

Sam Lang slang at mcs.anl.gov
Mon Aug 20 01:57:33 EDT 2007


On Aug 19, 2007, at 10:55 PM, Murali Vilayannur wrote:

> Hi Sam,
>
>> This of course doesn't map well to distributed file systems.  How are
>> you going to deal with cached files once the immutable attribute is
>> removed?
>
> Hmm.. yeah that is a problem..
> I realize that RobR is against adding state at the servers and I think
> he has perfectly valid reasons for that but it will be quite easy to
> do the cache invalidation if we
> - kept that state (of who all opened an immutable file) as a hint
> (even if servers crash and restart, clients can detect that and have
> them bubble up this information to kmod and have it wipe the cache
> clean)
> - we allow client-core or kmod to have a listening socket for such  
> messages.
>
> In any case, detecting the case of an immutable bit being removed from
> some node in the cluster for subsequent opens of the file is trivial..
> We only need to worry about the case
> of files that have already been opened. Without the invalidation based
> approach, there is no hope for solving this problem then :(
>
>>
>> I've attached a patch that adds the ioctls to enable setting the
>> immutable with the chattr command.
>
> Awesome! THanks for doing this! A few comments.
>
> - suggest you replace sizeof(uint64_t) with sizeof(val)
> -  pvfs2_xattr_get_default and pvfs2_xattr_set_default return either 0
> or -ve number. They don't return > 0. so the check can else if (ret ==
> 0)

Actually pvfs2_xattr_get_default calls pvfs2_inode_getxattr which  
returns the size of the extended attribute.

> - This looks incorrect:
>
>  if(ret >= 0)
> +        {
> +            return put_user(val, (int __user *)arg);
> +        }

I pulled that out of the ext3 ioctl code.  Seems to work.  Without it  
lsattr gives bad results.

>
> Flags:
> PVFS_IMMUTABLE_FL != FS_IMMUTABLE_FL
> PVFS_APPEND_FL != FS_APPEND_FL
> PVFS_NOATIME_FL != FS_NOATIME_FL

Yeah, why don't they?

>
> Before the put_user(), you should convert val from a PVFS_*_FL to a
> FS_*_FL flag I think.
> ELse the chattr utility won't understand these flags..
> - if(arg & FS_APPEND_FL)
> +        {
> +            val |= PVFS_IMMUTABLE_FL; <--- PVFS_APPEND_FL
> +        }
>
> - should XATTR_CREATE simply be 0?
> XATTR_CREATE will fail if a similar xattr already exists I think.

Ok.  What does 0 do if one doesn't already exist?  There's  
XATTR_REPLACE too which suggests that you either need one or the other.

> - I think you need to set ret = 0 if(val == 0) in the  
> FS_IOC_SETFLAGS path.
>
Agreed.

> - Btw: Did you test this on x86_64 with 32 bit user-space? DO we need
> to implement FS_IOC32_SETFLAGS
> or FS_IOC32_GETFLAGS?

No idea.  Its tested on x86_64 with 64 bit userspace.

-sam
>
> thanks,
> Murali
>
>>
>> -sam
>>
>>
>>
>>
>>> thanks,
>>> Murali
>>>
>>>> Maybe I'm missing something but I think deleting the file should be
>>>> allowed, in fact it should be the only way to remove the immutable
>>>> attribute.
>>>>
>>>> -sam
>>>>
>>>>> thanks,
>>>>> Murali
>>>>>>
>>>>>> -sam
>>>>>>
>>>>>> On Aug 17, 2007, at 10:09 PM, Murali Vilayannur wrote:
>>>>>>
>>>>>>> Sam,
>>>>>>> The problem is not in the system call (fsetxattr) but the
>>>>>>> arguments
>>>>>>> to it..
>>>>>>> user.pvfs2.meta_hint is the key and val is actually a uint64
>>>>>>> which is
>>>>>>> a bitwise OR
>>>>>>> of PVFS_IMMUTABLE_FL, other pvfs flags.
>>>>>>> modify_val() in pvfs2-xattr.c will give an example of this  
>>>>>>> usage.
>>>>>>> Sorry, it is a little convoluted ..:(
>>>>>>> but I couldn't/didn't want to do more string parsing on server
>>>>>>> side.
>>>>>>> Feel free to change that if you think it is needlessly  
>>>>>>> convoluted.
>>>>>>> thanks,
>>>>>>> Murali
>>>>>>>
>>>>>>> PS: let me know how the caching patches work out :)
>>>>>>> I havent had too much time to play with it since Feb though.
>>>>>>> Hope it works :)
>>>>>>>
>>>>>>>
>>>>>>> On 8/17/07, Sam Lang <slang at mcs.anl.gov> wrote:
>>>>>>>>
>>>>>>>> Hi Murali,
>>>>>>>>
>>>>>>>> I wrote a little program to test the performance of the read-
>>>>>>>> caching
>>>>>>>> immutable file stuff.  With the  attached program, I get a  
>>>>>>>> EINVAL
>>>>>>>> error on the read of the file after the immutable attribute has
>>>>>>>> been
>>>>>>>> set (using fsetxattr).  Also, ls -la gives me really strange
>>>>>>>> results
>>>>>>>> for the files that I've set that immutable attribute on.  In  
>>>>>>>> the
>>>>>>>> below listing, tmpfile1 and tmpfile10 didn't have the immutable
>>>>>>>> attribute set.  It looks like the problem is with the fsetxattr
>>>>>>>> system call.  The setfattr util does the same thing.  When I  
>>>>>>>> set
>>>>>>>> the
>>>>>>>> xattr with pvfs2-xattr though, I don't see the corruption in
>>>>>>>> listing
>>>>>>>> the file.  I'll try to investigate what fsetxattr is doing,
>>>>>>>> but are
>>>>>>>> you aware of any problems with using the system call?
>>>>>>>>
>>>>>>>> -sam
>>>>>>>>
>>>>>>>> root at bb22:/tmp/pvfsmnt# ls -la
>>>>>>>> total 10260
>>>>>>>> drwxrwxrwt 1 slang mpi      4096 2007-08-17 16:35 .
>>>>>>>> drwxrwxrwt 5 root  root     4096 2007-08-17 15:47 ..
>>>>>>>> drwxrwxrwx 1 slang mpi      4096 2007-08-17 15:47 lost+found
>>>>>>>> -rw-r--r-- 1 root  root        0 2007-08-17 16:24 tmpfile1
>>>>>>>> -rw-r--r-- 1 root  root 10485760 2007-08-17 16:34 tmpfile10
>>>>>>>> ?--------- ? ?     ?           ?                ? tmpfile11
>>>>>>>> ?--------- ? ?     ?           ?                ? tmpfile2
>>>>>>>> ?--------- ? ?     ?           ?                ? tmpfile3
>>>>>>>> ?--------- ? ?     ?           ?                ? tmpfile4
>>>>>>>> ?--------- ? ?     ?           ?                ? tmpfile5
>>>>>>>> ?--------- ? ?     ?           ?                ? tmpfile6
>>>>>>>> ?--------- ? ?     ?           ?                ? tmpfile7
>>>>>>>> ?--------- ? ?     ?           ?                ? tmpfile9
>>>>>>>> root at bb22:/tmp/pvfsmnt#
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Feb 20, 2007, at 1:06 AM, Murali Vilayannur wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>> Finally, I got some time to whip up the read-caching  
>>>>>>>>> patches for
>>>>>>>>> non-mutable files into a semblance of shape and stability.
>>>>>>>>> With this patch, I am able to get I/Os to a file (marked
>>>>>>>>> immutable)
>>>>>>>>> serviced from the page-cache. One can tag a file as  
>>>>>>>>> immutable by
>>>>>>>>> running,
>>>>>>>>> ./src/apps/admin/pvfs2-xattr -s -k user.pvfs2.meta_hint -v
>>>>>>>>> "+immutable" /path/to/pvfs2-file
>>>>>>>>> To verify if a file is indeed tagged immutable,
>>>>>>>>> ./src/apps/admin/pvfs2-xattr -t -k user.pvfs2.meta_hint /path/
>>>>>>>>> to/
>>>>>>>>> pvfs2-file
>>>>>>>>> (or)
>>>>>>>>> ./src/apps/admin/pvfs2-stat /path/to/pvfs2/file
>>>>>>>>>
>>>>>>>>> I have also added some preliminary statistics exported via
>>>>>>>>> /proc/sys/pvfs2/stats/
>>>>>>>>> that can be used as a placeholder for more interesting
>>>>>>>>> statistics
>>>>>>>>> later on.
>>>>>>>>> Currently, it only shows # of reads, writes, hits in thepage-
>>>>>>>>> cache
>>>>>>>>> and misses.
>>>>>>>>>
>>>>>>>>> For some reason now, cache hits do not happen across a file
>>>>>>>>> close.
>>>>>>>>> Within a file open-close session, all reads get serviced from
>>>>>>>>> the
>>>>>>>>> cache though. Very weird.
>>>>>>>>> My hunch is that file pages are somehow getting removed  
>>>>>>>>> from the
>>>>>>>>> radix
>>>>>>>>> tree of the address space due to some page-ref counting
>>>>>>>>> issues. I
>>>>>>>>> will
>>>>>>>>> dig into this later this week.
>>>>>>>>>
>>>>>>>>> In any case, this code should not cause any regression of  
>>>>>>>>> older
>>>>>>>>> code
>>>>>>>>> paths (hopefully!) and should not impose any performance
>>>>>>>>> penalties for
>>>>>>>>> workloads making use of the page-cache because of the way we
>>>>>>>>> aggregate
>>>>>>>>> cache miss I/Os to the server.
>>>>>>>>> It was really nice to be able to make use of the iox()
>>>>>>>>> infrastructure
>>>>>>>>> that was already in place to service non-contigous file and
>>>>>>>>> memory
>>>>>>>>> I/O.
>>>>>>>>> More details of the implementation is described in the thread
>>>>>>>>> below.
>>>>>>>>> http://www.beowulf-underground.org/pipermail/pvfs2-developers/
>>>>>>>>> 2006-
>>>>>>>>> November/002847.html
>>>>>>>>> Hopefully, I have addressed most of Pete's comments.
>>>>>>>>> More comments and testing welcome!
>>>>>>>>> thanks,
>>>>>>>>> Murali
>>>>>>>>> <read-cache-5.patch>
>>>>>>>>> _______________________________________________
>>>>>>>>> Pvfs2-developers mailing list
>>>>>>>>> Pvfs2-developers at beowulf-underground.org
>>>>>>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-
>>>>>>>>> developers
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>>
>>
>



More information about the Pvfs2-developers mailing list