[Pvfs2-developers] Review: experimental readcaching code
Murali Vilayannur
murali.vilayannur at gmail.com
Thu Nov 2 12:21:59 EST 2006
Hi guys,
I agree. I had no intention of checking this in at all. I only wanted
feedback on the patch.
I usually add comments only after I have tested things out prior to
checking things in..
Perhaps, I should have added it prior to sending it out for review.
sorry about that.
Thanks,
Murali
On 11/2/06, Sam Lang <slang at mcs.anl.gov> wrote:
>
> On Nov 2, 2006, at 9:12 AM, Walter B. Ligon III wrote:
>
> > Is there a particular reason this isn't documented IN THE CODE?
> > There's this thing called comments, that's what it's for!
> >
> Hi Walt,
>
> I completely agree that documentation should be a high priority, but
> aren't we all guilty of not doing it as much as we should? I've
> slogged through some pretty hairy bits of request processing and
> state machine code that I wished had a few more comments.
>
> In terms of patches, its nice to have comments in there, but in the
> end vetting should be done on the code, not the comments. I would
> argue that this should be handled on a case by case basis. If you
> get a patch you don't understand, ask for clarification as Pete has
> done in this case.
>
> -sam
>
>
> > After slogging through some of the more recent code (like the
> > client-core) I'm becomming of the opinion that we should not accept
> > patches or new code that isn't properly commented. The original
> > code base was much better (though not perfectly) commented.
> > Letting thousands of lines of code creep in uncommented like that
> > is a bad trend.
> >
> > Just a thought ...
> >
> > Walt
> >
> > Murali Vilayannur wrote:
> >> Hi Pete,
> >>> I must have misunderstood then. I thought that "pages" was a list
> >>> of all the ones needed for this particular access, while
> >>> "issue_pages" was the (possibly empty) subset that needed to have IO
> >>> started on them. Then if IO failed, you would just want to kill off
> >>> the issue_pages, not the entire set. But sounds like you know
> >>> what's going on.
> >> pages -> array of all pages needed for the access
> >> issue_pages -> array of only those pages for which I/O needs to be
> >> done
> >> page_list -> linked list of pages for which I/O is needed. this is
> >> required by the read_cache_pages() function and is drained after the
> >> cb finishes for each page.
> >> Hopefully, I have done the right thing there. I will double check
> >> today.
> >>>
> >>> >> Little stuff: no need to cast return value of kmalloc().
> >>> There's
> >>> >> also kzalloc() if you're going to do a memset(,0,) next.
> >>> >
> >>> > Is kzalloc() available on older kernels or should I add a
> >>> configure time
> >>> > option for that?
> >>> > Will change this too..
> >>>
> >>> True, it's somewhat recent. Ho hum. Could write your own kzalloc()
> >>> and use the config option, or just not bother and use the older
> >>> interfaces. I don't care. Probably either is easier than
> >>> submitting the code to the kernel so they maintain future changes
> >>> like this. :)
> >> yeah.. I think I will add the configure option and kzalloc stuff.
> >>> Or some magic per-directory opt-in xattr flag. I like that
> >>> thinking, as it parallels the immutable flag usage nicely.
> >> Exactly.
> >>>
> >>> > In order to do write buffering efficiently, we should try to
> >>> avoid doing a
> >>> > read before we modify the page and also keep track of the dirty
> >>> > ranges, but that seems like a hard thing to do from a module
> >>> without vfs
> >>> > changes. Aggregating writebacks for efficiency is also
> >>> > an issue, but I suspect we could look at the ->writepages
> >>> callback and see
> >>> > if that will work for us.
> >>>
> >>> "Avoid doing a read before we modify the page?" Wouldn't it go
> >>> something like this? I likely misunderstand again:
> >>>
> >>> read:
> >>> if IS_CTOO || IS_IMMUTABLE
> >>> if in page cache
> >>> return data, dirty or not
> >>> else
> >>> do IO
> >>> write:
> >>> if IS_CTOO
> >>> if in page cache
> >>> update data, mark dirty
> >>> else
> >>> do IO, then update and mark dirty
> >>>
> >>> and just let pagecache writeback take care of its business as
> >>> normal.
> >>>
> >>> Dirty ranges? Nah, don't bother. Just do full page granularity.
> >> Oh ok.. I guess if we do whole page granularity we might be ok.. not
> >> sure. think of the following scenario,
> >> I am mostly worried about correctness issues with caching even in the
> >> presence of non-overlapping data.
> >> Initial file size is 0.
> >> Client 1 on node 1 writes to file offsets 0 to 3000 and client 2 on
> >> node 2 from 3001 to 6000. With a page size of 4096, they both share
> >> file block 0.. What ends up on disk for file block 0 with write
> >> buffering? Sadly, we cannot punt and say user error since the program
> >> was operating with nonoverlapping offsets.
> >> Perhaps, we could ask such users/programs from not using the write
> >> buffering algorithm..
> >> Will MPI-IO programs be able to prevent this kind of thing from
> >> happening magically?
> >>>
> >>> And efficiency may just happen with the new writepages. If it
> >>> doesn't, I don't think we should go out of our way to make this
> >>> usage model particularly fast.
> >> sounds good to me.
> >>>
> >>> > BTW: Do not try the patch just yet. I have done zero testing on
> >>> it and it
> >>> > might crash your machine.
> >>>
> >>> Heh. I'm too busy crashing it myself with SCSI patches.
> >> Awesome. I am doing the same at work here :)
> >>> P.S. Your gmail produces ugly mail. Can you turn off HTML, and
> >>> maybe get it to do quoting right?
> >> I turned off html. But i have no clue how to do the latter.
> >> Thanks as always for the insightful reviews..
> >> Murali
> >> _______________________________________________
> >> Pvfs2-developers mailing list
> >> Pvfs2-developers at beowulf-underground.org
> >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >
> > --
> > Dr. Walter B. Ligon III
> > Associate Professor
> > ECE Department
> > Clemson University
> > _______________________________________________
> > Pvfs2-developers mailing list
> > Pvfs2-developers at beowulf-underground.org
> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >
>
>
More information about the Pvfs2-developers
mailing list