[PVFS-developers] Re: [PVFS-users] Recompile pvfs module for SuSE 2.4.19-NUMA

Rob Ross rross at mcs.anl.gov
Mon Mar 8 15:08:29 EST 2004


On Mon, 8 Mar 2004, Claude Pignol wrote:

> Rob Ross wrote:
> 
> >Oh, I misunderstood what you were saying before.  I thought that the "few 
> >MB" was your file size, not your access size.
> >  
> >
> The problem is the I/O size, not the file size.
> 
> >How many I/O servers do you have in the system?  How much memory do you 
> >have in your client?
> >  
> >
> 10 I/O servers, 1GB (dedicated for iod)

Do the clients have this much RAM too?

> >These four /proc values are the default and maximum socket buffer sizes, 
> >if I understand things correctly:
> >  /proc/sys/net/core/rmem_default
> >  /proc/sys/net/core/rmem_max
> >  /proc/sys/net/core/wmem_default
> >  /proc/sys/net/core/wmem_max
> >  
> >
> r(w)mem_default is 65535
> r(w)mem_max is 131071

I would adjust these up significantly.  I've seen suggestions of as much 
as 8MB for wide area; maybe try 1MB and see how that goes?  We're much 
nicer about socket usage now, so it shouldn't be too much of a resource 
hog.

I don't think the client adjusts these, so it's going to use the default.  
The iod *does* adjust these -- see below.
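
For reference, here is a minimal Python sketch (not part of PVFS) for checking the four limits above and for seeing what the kernel actually grants a socket that asks for a larger buffer. The helper names and the 1MB request are illustrative assumptions. On Linux the value reported back by getsockopt is typically double the effective request, and the request itself is capped by rmem_max/wmem_max.

```python
import socket
from pathlib import Path

CORE = Path("/proc/sys/net/core")
NAMES = ("rmem_default", "rmem_max", "wmem_default", "wmem_max")


def read_core_limits(base=CORE):
    """Return {name: bytes} for the limits readable under base."""
    limits = {}
    for name in NAMES:
        try:
            limits[name] = int((Path(base) / name).read_text().split()[0])
        except (OSError, ValueError, IndexError):
            # Not Linux, or the file is missing/unreadable: skip it.
            continue
    return limits


def granted_buffers(request_bytes):
    """Ask for request_bytes on a TCP socket; return what the kernel reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, request_bytes)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, request_bytes)
        return (s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
                s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))


if __name__ == "__main__":
    for name, value in read_core_limits().items():
        print(f"{name}: {value}")
    rcv, snd = granted_buffers(1024 * 1024)
    print(f"asked for 1048576, kernel reports rcvbuf={rcv} sndbuf={snd}")
```

Running this before and after raising the /proc values should show whether a larger request is still being clipped by the maximums.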

> >Also, you might want to adjust the following in your iod.conf file (see 
> >man pages for details): socket_buf, access_size.
> >  
> >
> write_buf 512
> access_size 512
> socket_buf 64

I would adjust access_size up to some multiple of the new wmem_max so that 
there is a large enough memory mapped region to fill the buffer with one 
send.  Likewise for write_buf.

I would adjust socket_buf to be the same as r(w)mem_max, because that is 
what the iod will use.
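
A small Python sketch of that sizing arithmetic follows. It assumes the iod.conf values are given in kilobytes (which is consistent with socket_buf 64 against a 65535-byte default, but check the man page), and the multiple of 4 is an arbitrary illustration, not a recommendation.

```python
def iod_conf_sizes(wmem_max_bytes, multiple=4):
    """Suggest iod.conf values (assumed to be in KB) from wmem_max in bytes."""
    # Round up so a limit like 131071 maps to 128 KB rather than 127 KB.
    socket_buf_kb = (wmem_max_bytes + 1023) // 1024
    region_kb = multiple * socket_buf_kb
    return {
        "socket_buf": socket_buf_kb,
        "access_size": region_kb,
        "write_buf": region_kb,
    }


if __name__ == "__main__":
    for key, value in iod_conf_sizes(1024 * 1024).items():
        print(f"{key} {value}")
```

With wmem_max at 1MB this gives socket_buf 1024, access_size 4096 and write_buf 4096 under the assumptions above.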

> >About where does the dropoff start to occur?
> >  
> >
> I/O size of 256KB
> 
> The read rate is around 4MB/s for I/O of 1024K
> 
> Thanks
> Claude

Let me know if this helps.  Also, as a kick-start for the next stage, what 
sort of storage do you have on those nodes (single disks, SW RAID, FC 
attached, ...)?

Thanks,

Rob

> >Regards,
> >
> >Rob
> >
> >On Mon, 8 Mar 2004, Claude Pignol wrote:
> >
> >>Thanks Rob,
> >>
> >>Another fact:
> >>I found that the read works very well with 64K I/O: the read speed is 
> >>better than the write speed.
> >>The read performance starts degrading when I increase the I/O size.
> >>
> >>I agree that there is a startup cost, but the read-ahead mechanism
> >>speeds up the disk access.
> >>I am testing with files of at least 1GB.
> >>
> >>I have tested with dynamic buffering (the default) and with static buffering.
> >>Same problem.
> >>How do you increase the TCP buffer size?
> >>net.ipv4.tcp_rmem
> >>net.ipv4.tcp_wmem
> >>net.ipv4.tcp_mem
> >>
> >>
> >>Claude
> >>
> >>
> >>Rob Ross wrote:
> >>
> >>>Hi Claude,
> >>>
> >>>Sorry we didn't get back to you sooner.  I'm glad that the kernel update 
> >>>fixed the problem.
> >>>
> >>>What block size (bs=XXX) are you using in your tests?
> >>>
> >>>Note that when reading no I/O can start until data is read off disk, while 
> >>>in the write case data can start moving right away.  So you may just be 
> >>>seeing startup costs.
> >>>
> >>>You could look at increasing TCP buffer sizes on your system as a first 
> >>>step.
> >>>
> >>>Regards,
> >>>
> >>>Rob
> >>>
> >>>On Mon, 8 Mar 2004, Claude Pignol wrote:
> >>>
> >>>>Greetings,
> >>>>
> >>>>An upgrade to 2.4.21 fixes the problem.
> >>>>The module compiles and starts OK.
> >>>>I have noticed a performance problem when reading from PVFS.
> >>>>With big I/O (a few MB), reading runs at around 1/3 of the write performance.
> >>>>The PVFS daemons are running with default parameters.
> >>>>I am reading/writing from one node to PVFS using dd.
> >>>>I have verified the disk performance of all 10 I/O nodes.
> >>>>I have also verified the network performance to all the nodes.
> >>>>What is the best strategy/tool to address this kind of problem?
> >>>>Thanks
> >>>>
> >>>>
> >>>>Claude Pignol wrote:
> >>>>
> >>>>>Greetings,
> >>>>>
> >>>>>I am trying to benchmark pvfs with the SuSE 2.4.19-NUMA kernel
> >>>>>to compare it with the SuSE 2.4.19-SMP kernel.
> >>>>>There is no problem compiling and loading the pvfs.o module with the SMP kernel.
> >>>>>
> >>>>>With the NUMA kernel I get 3 undefined symbols when I try to load the 
> >>>>>module
> >>>>>pvfs.o: unresolved symbol __pollwait
> >>>>>pvfs.o: unresolved symbol mem_map
> >>>>>pvfs.o: unresolved symbol iget4
> >>>>>
> >>>>>The kernel source is installed.
> >>>>>Any idea?
> >>>>>Thanks in advance
> >>>>>Claude


