[PVFS-developers]
Re: [PVFS-users] Recompile pvfs module for SuSE 2.4.19-NUMA
Rob Ross
rross at mcs.anl.gov
Mon Mar 8 17:32:34 EST 2004
Hey,
What's your strip size default?
So adjusting those parameters did have a positive effect for many cases,
but the 256KB read case is still bad?
Is it consistently bad for ever-larger sizes, or is that particular size a
bad one?
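
One way to answer that is to sweep the read size on a single client and see
whether 256KB is an isolated bad point or the start of a trend. Below is a
minimal Python sketch of that idea. It is not part of PVFS; the function
names (time_reads, sweep) are made up for illustration, and it only times
plain read() calls the way dd would. Caching anywhere along the path can
inflate repeat runs, so a large test file is assumed.

```python
import os
import time


def time_reads(path, block_size, total_bytes):
    """Read up to total_bytes from path in block_size requests; return MB/s."""
    fd = os.open(path, os.O_RDONLY)
    try:
        done = 0
        start = time.perf_counter()
        while done < total_bytes:
            # Each request asks for at most block_size bytes.
            chunk = os.read(fd, min(block_size, total_bytes - done))
            if not chunk:  # reached end of file early
                break
            done += len(chunk)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    # Guard against a zero elapsed time on very small inputs.
    return done / (1024 * 1024) / max(elapsed, 1e-9)


def sweep(path, sizes_kb, total_bytes):
    """Return {request size in KB: measured read rate in MB/s}."""
    return {kb: time_reads(path, kb * 1024, total_bytes) for kb in sizes_kb}
```

For example, sweep("/mnt/pvfs/testfile", [64, 128, 256, 512, 1024], 1 << 30)
would read 1GB at each size (the mount point here is hypothetical). If only
the 256KB entry is slow, that points at that particular size; if everything
from 256KB up is slow, it points at a threshold.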
Thanks,
Rob
On Mon, 8 Mar 2004, Claude Pignol wrote:
> Rob,
>
>
> I/O 64KB no problem
> I/O 128KB no problem
> I/O 256KB: write no problem, but read is 10 times slower.
> Tuning the parameters gives better performance in the cases that
> already work normally,
> but with 256KB I/O pvfs still doesn't behave normally.
> The current parameters are
> r(w)mem_max 1048575
> write_buf 4096
> access_size 4096
> socket_buf 1024
> No error message in the pvfs log
>
> Disks: RAID disks that can deliver 30MB/s,
> dedicated to pvfs data
>
> Regards
> Claude
>
> Rob Ross wrote:
>
> >On Mon, 8 Mar 2004, Claude Pignol wrote:
> >
> >>Rob Ross wrote:
> >>
> >>>Oh, I misunderstood what you were saying before. I thought that the "few
> >>>MB" was your file size, not your access size.
> >>>
> >>The problem is the I/O size not the file size.
> >>
> >>>How many I/O servers do you have in the system? How much memory do you
> >>>have in your client?
> >>>
> >>10 I/O servers, 1GB (dedicated for iod)
> >>
> >>
> >
> >Clients have this much RAM too?
> >
> >>>These four /proc values are the default and maximum socket buffer sizes,
> >>>if I understand things correctly:
> >>> /proc/sys/net/core/rmem_default
> >>> /proc/sys/net/core/rmem_max
> >>> /proc/sys/net/core/wmem_default
> >>> /proc/sys/net/core/wmem_max
> >>>
> >>r(w)mem_default is 65535
> >>r(w)mem_max is 131071
> >>
> >>
> >
> >I would adjust these up significantly. I've seen suggestions of as much
> >as 8MB for wide area; maybe try 1MB and see how that goes? We're much
> >nicer about socket usage now, so it shouldn't be too much of a resource
> >hog.
> >
> >I don't think the client adjusts these, so it's going to use the default.
> >The iod *does* adjust these -- see below.
> >
> >>>Also, you might want to adjust the following in your iod.conf file (see
> >>>man pages for details): socket_buf, access_size.
> >>>
> >>write_buf 512
> >>access_size 512
> >>socket_buf 64
> >>
> >>
> >
> >I would adjust access_size up to some multiple of the new wmem_max so that
> >there is a large enough memory mapped region to fill the buffer with one
> >send. Likewise for write_buf.
> >
> >I would adjust socket_buf to be the same as r(w)mem_max, because that is
> >what the iod will use.
> >
> >>>About where does the dropoff start to occur?
> >>>
> >>I/O size of 256KB
> >>
> >>The read rate is around 4MB/s for I/O of 1024K
> >>
> >>Thanks
> >>Claude
> >>
> >>
> >
> >Let me know if this helps. Also, as a kick-start for the next stage, what
> >sort of storage do you have on those nodes (single disks, SW RAID, FC
> >attached, ...)?
> >
> >Thanks,
> >
> >Rob
> >
> >>>Regards,
> >>>
> >>>Rob
> >>>
> >>>On Mon, 8 Mar 2004, Claude Pignol wrote:
> >>>
> >>>>Thanks Rob,
> >>>>
> >>>>Another fact:
> >>>>I found that the read works very well with 64K I/O: the read speed is
> >>>>better than the write speed.
> >>>>Read performance starts degrading when I increase the I/O size.
> >>>>
> >>>>I agree that there is a startup cost, but there is also the read-ahead
> >>>>mechanism, which speeds up disk access.
> >>>>I am testing with files of at least 1GB.
> >>>>
> >>>>I have tested with dynamic buffering (the default) and with static buffering.
> >>>>Same problem.
> >>>>How do you increase tcp buffer size?
> >>>>net.ipv4.tcp_rmem
> >>>>net.ipv4.tcp_wmem
> >>>>net.ipv4.tcp_mem
> >>>>
> >>>>
> >>>>Claude
> >>>>
> >>>>
> >>>>Rob Ross wrote:
> >>>>
> >>>>>Hi Claude,
> >>>>>
> >>>>>Sorry we didn't get back to you sooner. I'm glad that the kernel update
> >>>>>fixed the problem.
> >>>>>
> >>>>>What block size (bs=XXX) are you using in your tests?
> >>>>>
> >>>>>Note that when reading no I/O can start until data is read off disk, while
> >>>>>in the write case data can start moving right away. So you may just be
> >>>>>seeing startup costs.
> >>>>>
> >>>>>You could look at increasing TCP buffer sizes on your system as a first
> >>>>>step.
> >>>>>
> >>>>>Regards,
> >>>>>
> >>>>>Rob
> >>>>>
> >>>>>On Mon, 8 Mar 2004, Claude Pignol wrote:
> >>>>>
> >>>>>>Greetings,
> >>>>>>
> >>>>>>An upgrade to 2.4.21 fixes the problem.
> >>>>>>Compile and start OK.
> >>>>>>I have noticed a performance problem in reading from PVFS.
> >>>>>>With big I/O (a few MB), reading runs at around 1/3 of the write performance.
> >>>>>>The PVFS daemons are running with default parameters.
> >>>>>>I am reading/writing from one node to pvfs using dd.
> >>>>>>I have verified the disk performance of all 10 I/O nodes.
> >>>>>>I have also verified the network performance to all the nodes.
> >>>>>>What is the best strategy/tools to address this kind of problem?
> >>>>>>Thanks
> >>>>>>
> >>>>>>
> >>>>>>Claude Pignol wrote:
> >>>>>>
> >>>>>>>Greetings,
> >>>>>>>
> >>>>>>>I am trying to benchmark pvfs with the SuSE 2.4.19-NUMA kernel
> >>>>>>>to compare it with the SuSE 2.4.19-SMP kernel.
> >>>>>>>There is no problem compiling and loading the pvfs.o module with the SMP kernel.
> >>>>>>>
> >>>>>>>With the NUMA kernel I get 3 unresolved symbols when I try to load the
> >>>>>>>module:
> >>>>>>>pvfs.o: unresolved symbol __pollwait
> >>>>>>>pvfs.o: unresolved symbol mem_map
> >>>>>>>pvfs.o: unresolved symbol iget4
> >>>>>>>
> >>>>>>>The kernel source is installed.
> >>>>>>>Any idea?
> >>>>>>>Thanks in advance
> >>>>>>>Claude
> >>>>>>>
> >>>>>>>
> >>>>>>>_______________________________________________
> >>>>>>>PVFS-users mailing list
> >>>>>>>PVFS-users at www.beowulf-underground.org
> >>>>>>>http://www.beowulf-underground.org/mailman/listinfo/pvfs-users
> >>>>>>>
> >>>>>>_______________________________________________
> >>>>>>PVFS-developers mailing list
> >>>>>>PVFS-developers at www.beowulf-underground.org
> >>>>>>http://www.beowulf-underground.org/mailman/listinfo/pvfs-developers
> >>>>>>
> >>>>>
> >>>
> >
>