[Pvfs2-developers] Rebalancing after adding additional nodes

Ti Leggett leggett at mcs.anl.gov
Sat Aug 26 11:17:05 EDT 2006


I've written a quick and dirty script that, in theory, should  
rebalance files across new IO nodes. In practice though, what  
happened was it actually filled my file system. Here's the deal. I  
had a cluster of 4 nodes, each with a 250G disk dedicated for PVFS2  
giving me a PVFS2 volume of about 900G. Before I added nodes I had  
about 108G free (or about 27G free on each node). I added 4 more  
nodes identical to the original 4, each with 250G PVFS2 disks. After  
extending my PVFS2 volume, I had double the total space, but the
reported free space also only doubled, because the first node (one of
the originals) still had only 27G free. Also, all the original files
were still distributed across only the original 4 nodes. Enter my
script. The purpose is to rebalance files across all nodes so that
a) they're evenly distributed across the nodes, b) space is evenly
freed across the nodes, and c) free space is accurately reported.
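
The numbers above fit a simple model in which the reported free space
is the smallest per-server free space multiplied by the number of
servers. That model is an assumption inferred from the figures in this
message, not something confirmed from the PVFS2 sources. The function
name reported_free and the 225G figure for an empty new server (about
900G / 4) are illustrative only. A minimal sketch of the arithmetic:

```shell
#!/bin/sh
# Assumed model: reported free = (minimum per-server free) * (server count).
# Arguments are the free space, in GB, on each server.
reported_free() {
    min=$1
    count=0
    for f in "$@"; do
        if [ "$f" -lt "$min" ]; then
            min=$f
        fi
        count=$((count + 1))
    done
    echo $((min * count))
}

# Four original servers with about 27G free each: prints 108
reported_free 27 27 27 27

# After adding four empty servers (assumed ~225G free each): prints 216
reported_free 27 27 27 27 225 225 225 225
```

Under this model the new servers' empty disks do not show up in the
total until the fullest original server gains free space.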

Here's how it works. You do the following to get a list of all the  
files to rebalance:

du -a PVFS2_FS | sort -n | sed 's/^[0-9][0-9]*\t\t*//' > SOME_TEMP_FILE

Where PVFS2_FS is the mounted PVFS2 volume and SOME_TEMP_FILE is just  
that.
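
Note that du -a also lists directories, and the \t in the sed
expression relies on GNU sed. If only regular files should be
rebalanced, a variant along the following lines might be used
instead. This is a sketch, not the command used above; the function
name list_files_by_size is made up for illustration, and it assumes
no file name contains a newline.

```shell
#!/bin/sh
# Print the regular files under directory $1, smallest disk usage first.
list_files_by_size() {
    # du -k prints "<size in KB><TAB><path>" for each file;
    # sort -n orders by size and cut drops the size column.
    find "$1" -type f -exec du -k {} + | sort -n | cut -f2-
}
```

It would be called as: list_files_by_size PVFS2_FS > SOME_TEMP_FILE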

The theory is that for a fairly even mix of small and large files on
the system, if you rebalance the smaller files first it will free up
space that can be used to rebalance the large files. This only
matters if you're not adding space that is equal in size to (or
actually slightly larger than) your largest file. This is a kind of
inverse Brazil nut effect. As all these files are rebalanced across
all the nodes, available space should start increasing on the
original 4 nodes and therefore on the volume as a whole. I've
attached the script for comment.
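
Since the attachment is not reproduced in the archive, here is one
way such a rebalancing pass could look. It is a sketch and not
necessarily what the attached script does. It assumes that a newly
created file is striped across all current servers, so copying each
file and renaming the copy over the original moves its data onto the
new layout. The names rebalance_file and rebalance_list are
hypothetical.

```shell
#!/bin/sh
# Rewrite one file in place: copy it, verify the copy, rename it over
# the original. Needs room for a second copy of the file while running.
rebalance_file() {
    src=$1
    tmp="$src.rebalance.$$"
    cp -p "$src" "$tmp" || { rm -f "$tmp"; return 1; }
    # Only replace the original if the copy is byte-for-byte identical.
    cmp -s "$src" "$tmp" || { rm -f "$tmp"; return 1; }
    mv -f "$tmp" "$src"
}

# Process the paths listed in file $1, one per line, in the given order
# (smallest first if the list was sorted by size).
rebalance_list() {
    while IFS= read -r f; do
        [ -f "$f" ] || continue    # skip directories and vanished files
        rebalance_file "$f" || echo "skipped: $f" >&2
    done < "$1"
}
```

It would be run as: rebalance_list SOME_TEMP_FILE. Because each file
briefly exists twice, the smallest-first ordering matters for exactly
the reason given above.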

In actuality, my volume filled up while rebalancing and, curiously, the
new nodes now have as much space taken up on their PVFS2 disks as
was available on each of the original nodes prior to the rebalancing.
That is, each new node's disk has about 27-28G used. I'm not sure how  
this is possible. I've checked that I didn't actually create or  
duplicate each rebalanced file by comparing the filenames before  
rebalancing and after the attempt and they're the same. So there are  
no new files on the volume. Using pvfs2-viewdist, I've verified that
the rebalanced files were distributed across only the original 4
nodes before rebalancing and across all 8 nodes after. So I'm completely
baffled how rebalancing actually used up more space seeing as no new  
files were actually added. Anyone have any ideas? Anyone see anything  
the matter with my rebalancing script? Thanks!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pvfs2-rebalance
Type: application/octet-stream
Size: 1914 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20060826/0269771e/pvfs2-rebalance.obj

