From kschoche at gmail.com Thu Feb 2 16:12:24 2012
From: kschoche at gmail.com (Kyle Schochenmaier)
Date: Thu Feb 2 16:12:14 2012
Subject: [Pvfs2-developers] Re: [Pvfs2-users] no connection on infiniband dual port card

Hi Randy -
I'm not familiar with any of the pushes towards HA, so I'm not sure
where you stand at this point or how we plan on implementing that in
the future, but I wouldn't mind contributing once there is a game plan
in place.

But, looking at Vlad's request to be able to pass a hint or option to
pvfs to define which IB port to use, I'm not really sure where to add
this... Where's the best place to put the config param? In the config
file? On the command line? And then how do we push that all the way
down to the bmi_ib code?

Vlad -
I threw this together today. It will pick port number 2 instead of port
number 1 if it exists, but it isn't configurable yet, still pending the
comments above. It compiles, but I don't have the hardware anymore to
test it. This approach completely disregards the fact that an HCA may
have more than 2 ports, but once we figure out how we want to pass the
parameter down to bmi we can fix that with 1 line.

Can you test it out?
Thanks
~Kyle

test@test:~/pvfs/orangefs$ cat port_number.patch
Index: src/io/bmi/bmi_ib/openib.c
===================================================================
--- src/io/bmi/bmi_ib/openib.c (revision 9182)
+++ src/io/bmi/bmi_ib/openib.c (working copy)
@@ -899,8 +899,22 @@
     ib_device->func.check_async_events = openib_check_async_events;
 
     od->ctx = ctx;
-    od->nic_port = IBV_PORT; /* maybe let this be configurable */
 
+    /* Query the device for the max_ requests and such */
+    ret = ibv_query_device(od->ctx, &hca_cap);
+    if (ret)
+        error_xerrno(ret, "%s: ibv_query_device", __func__);
+    VALGRIND_MAKE_MEM_DEFINED(&hca_cap, sizeof(hca_cap));
+
+    /* Try to see if we can bring up more than one port instead of hc it */
+    if((int)hca_cap.phys_port_cnt > 1)
+    {
+        /* parse in the port number request here */
+        od->nic_port=2;
+    }
+    else
+        od->nic_port = IBV_PORT; /* default to port 1 */
+
     /* get the lid and verify port state */
     ret = ibv_query_port(od->ctx, od->nic_port, &hca_port);
     if (ret)
@@ -913,12 +927,8 @@
         error("%s: port state is %s but should be ACTIVE; check subnet manager",
               __func__, openib_port_state_string(hca_port.state));
 
-    /* Query the device for the max_ requests and such */
-    ret = ibv_query_device(od->ctx, &hca_cap);
-    if (ret)
-        error_xerrno(ret, "%s: ibv_query_device", __func__);
-    VALGRIND_MAKE_MEM_DEFINED(&hca_cap, sizeof(hca_cap));
-
+    /* used to query device here, but we queried these up above to define
+       port number */
     debug(1, "%s: max %d completion queue entries", __func__, hca_cap.max_cq);
     cqe_num = IBV_NUM_CQ_ENTRIES;
     od->nic_max_sge = hca_cap.max_sge;

Kyle Schochenmaier

On Tue, Jan 31, 2012 at 1:33 PM, Randall Martin wrote:
> Kyle,
>
> Yes I think we need some form of fail-over capability with multi-port NICs
> in orangefs for HA. As the number of I/O servers grows, the odds of some
> kind of hardware failure increase. Network errors in this brave new
> world should be expected and tolerated as much as possible. This might be
> a good step in that direction.
>
> -Randy
>
> On 1/31/12 10:27 AM, "Kyle Schochenmaier" wrote:
>
>>Hi Vlad, All -
>>
>>A couple comments..
>>You can probably just hardcode to port 2 to force things onto port 2,
>>feel free to test it out, just be sure to rebuild and push out all of
>>the client and server binaries so they all play nicely with each other.
>>
>>Also, I don't think we can implement port bonding at this level, it
>>would require quite a bit of work and synchronization which could put
>>in enough overhead to make it not perform significantly faster.. I
>>guess the only way to tell would be to try it but I'm going to suspect
>>that it might not be beneficial.
>>
>>Now,
>>The other thing you mentioned had to do with port fail-over I
>>believe..this is why I'm bringing in the dev list here. Currently I
>>believe the standard practice across all interconnects using bmi is to
>>have a hard fail whenever a particular port configuration fails to
>>come up initially.
>>
>>But I know we're going to be making a push into HA with orangefs soon
>>so I am wondering what people's thoughts are here?
>>Is this something that would need to be implemented anyways, does it
>>fit the HA scheme that is being examined for orangefs?
>>Thoughts?
>>
>>
>>Kyle Schochenmaier
>>
>>
>>
>>On Tue, Jan 31, 2012 at 2:25 AM, vlad wrote:
>>> Dear Kyle,
>>>
>>>
>>>> I don't think we ever got around to testing this with multiple ports
>>>> active on each HCA when we wrote it, so I believe I hard coded it to
>>>> just default to the first port... iirc we tried to bring up the 2nd
>>>> port at one point and found that there were some memory exhaustion
>>>> issues when using more than one port AND the default HCA
>>>> buffers/MTUs/etc on the cards that this was primarily tested on so it
>>>> went back to 1.
>>>>
>>>> I wouldn't recommend changing this via a hard code for obvious
>>>> reasons, but at the same time it probably wouldn't take more than
>>>> 20-30 lines of code to fix this up to take more than one port. I'll
>>>> try to take a look at it.
>>>
>>>
>>> Thanks, I never thought about bonding the 2 infiniband ports.
>>>
>>> It is absolutely sufficient for me to swap port 1 for port 2. I never had
>>> the intention of using both ports simultaneously.
>>> Could this be achieved by changing IBV_PORT to "2" instead of "1"?
>>>
>>> For the new code wishlist (if I may ask for it ..):
>>>
>>> It would be nice to be able to define the infiniband port in the config
>>> file and to have a defined fallback to the other infiniband port, if the
>>> 1st one does not work.
>>>
>>> (example connection should go over ib0,ib1://host:port/service)
>>>
>>> But that is not very urgent to have.
>>>
>>> Thanks for all the help,
>>>
>>> Greetings,
>>>
>>> Vlad
>>
>>_______________________________________________
>>Pvfs2-developers mailing list
>>Pvfs2-developers@beowulf-underground.org
>>http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>

From dimstamat at gmail.com Tue Feb 14 21:02:50 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Tue Feb 14 21:02:40 2012
Subject: [Pvfs2-developers] pvfs2 tests

Hello all,
how can I compile the pvfs2 tests? There is no Makefile there, and the
default Makefile does not seem to have an option for building these
tests.
I basically want to run my own tests, but I don't know which libraries
to include... For example it says: undefined reference to `PVFS_BYTE'
so maybe I need to compile my test together with
src/io/description/pvfs-request.c, where PVFS_BYTE is defined.

Many thanks,
Dimos.

From elaine at omnibond.com Tue Feb 14 22:55:01 2012
From: elaine at omnibond.com (Elaine Quarles)
Date: Tue Feb 14 22:53:59 2012
Subject: [Pvfs2-developers] pvfs2 tests
Message-ID: <042301cceb95$987d6170$c9782450$@com>

In the test directory, run ./configure to generate the Makefile for the
tests.

Hope that helps,
Elaine

From dimstamat at gmail.com Wed Feb 15 17:09:49 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Wed Feb 15 17:09:38 2012
Subject: [Pvfs2-developers] pvfs2 tests
In-Reply-To: <042301cceb95$987d6170$c9782450$@com>
References: <042301cceb95$987d6170$c9782450$@com>

Hello and thanks for your answer.
I found the configure script, and the Makefile was produced.
But I get an error:

debian:~/pvfs2.8.1-original/test# make
  LD client/sysint/io-test-offset
collect2: ld returned 1 exit status
make: *** [client/sysint/io-test-offset] Error 1

All the previous steps were completed successfully.

Many thanks,
Dimos.

From dimstamat at gmail.com Wed Feb 15 20:21:32 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Wed Feb 15 20:21:23 2012
Subject: [Pvfs2-developers] pvfs2 tests
In-Reply-To:
References: <042301cceb95$987d6170$c9782450$@com>

Hello, I found the reason. My VM ran out of space! :)

Thanks,
Dimos.

From dimstamat at gmail.com Wed Feb 15 21:08:19 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Wed Feb 15 21:08:08 2012
Subject: [Pvfs2-developers] pvfs2 tests
In-Reply-To:
References: <042301cceb95$987d6170$c9782450$@com>

Hello again!
How can I make my own tests? It's hard for me to find all the
dependencies... So is there a way to use the existing makefile and force
it to compile my own test files as well?
What is the easiest way to have my own tests?

Many thanks,
Dimos.

From dimstamat at gmail.com Thu Feb 16 08:25:17 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Thu Feb 16 08:25:07 2012
Subject: [Pvfs2-developers] making my own tests

Hello all,
I would like to ask you what is the easiest way to have my own tests.
Shall I put them in the test directory with some modifications in the
Makefile? What's your opinion about that?

Many thanks,
Dimos.

From ligon at omnibond.com Thu Feb 16 09:26:31 2012
From: ligon at omnibond.com (Becky Ligon)
Date: Thu Feb 16 09:26:21 2012
Subject: [Pvfs2-developers] making my own tests

Dimos:

There are many ways to create your own tests. One suggestion is to put
your test program in the /src/apps/admin directory and
update the module.mk file (in that same directory) to include your
program. Then, you can use the Makefile in the to compile
your program. From the , issue "make install" and your new
program will be compiled and put in the bin directory of your installation
directory (where the pvfs2-xxx programs are located).

Becky

--
Becky Ligon
OrangeFS Support and Development
Omnibond Systems
Anderson, South Carolina

From dimstamat at gmail.com Thu Feb 16 09:38:04 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Thu Feb 16 09:37:54 2012
Subject: [Pvfs2-developers] making my own tests

Hello and thanks for your answer!
I figured out a solution a little while ago by observing how the
existing tests work.
Many thanks anyway,
Dimos.

From ligon at omnibond.com Thu Feb 16 09:40:54 2012
From: ligon at omnibond.com (Becky Ligon)
Date: Thu Feb 16 09:40:45 2012
Subject: [Pvfs2-developers] making my own tests

Fantastic! Good luck with your tests!

Becky

From ligon at omnibond.com Thu Feb 16 09:54:19 2012
From: ligon at omnibond.com (Becky Ligon)
Date: Thu Feb 16 09:54:09 2012
Subject: [Pvfs2-developers] making my own tests

Cool! This solution is what I do for development on OrangeFS!

Becky

On Thu, Feb 16, 2012 at 9:48 AM, Dimos Stamatakis wrote:
> By the way, your solution is much better and easier. (My own approach
> was editing the test configure file and adding my own tests.)
> So I'm going to use your solution.
>
> Many thanks again,
> Dimos.

From bircoph at gmail.com Thu Feb 16 12:52:19 2012
From: bircoph at gmail.com (Andrew Savchenko)
Date: Thu Feb 16 12:52:37 2012
Subject: [Pvfs2-developers] [PATCH] [BUG] Fix build of a static server for orangefs-2.8.5
Message-ID: <20120216215219.fa848002.bircoph@gmail.com>

Skipped content of type multipart/mixed
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20120216/b5fbbd19/attachment.bin

From ligon at omnibond.com Thu Feb 16 14:22:28 2012
From: ligon at omnibond.com (Becky Ligon)
Date: Thu Feb 16 14:22:19 2012
Subject: [Pvfs2-developers] [PATCH] [BUG] Fix build of a static server for orangefs-2.8.5
In-Reply-To: <20120216215219.fa848002.bircoph@gmail.com>
References: <20120216215219.fa848002.bircoph@gmail.com>

Thanks, Andrew! We will add the patch to our testing list.

Becky Ligon

On Thu, Feb 16, 2012 at 12:52 PM, Andrew Savchenko wrote:
> Hello,
>
> While testing various build options of orangefs-2.8.5 I found that
> orangefs fails to build when configured with --enable-static-server.
> I use Gentoo with vanilla kernel 3.0.17 and glibc-2.13.
>
> It fails as follows:
>
> gcc src/server/pvfs2-server-server.o src/server/pvfs2-server-req-server.o
> lib/libpvfs2-server.a -o src/server/pvfs2-server
> -L/home/andrew/src/orangefs-2.8.5/lib -static -lpvfs2-server -lcrypto
> -lssl -ldb -lpthread -lrt
> src/server/pvfs2-server-server.o: In function `PINT_server_access_debug':
> /home/andrew/src/orangefs-2.8.5/src/server/pvfs2-server.c:2301: warning:
> Using 'getgrgid' in statically linked applications requires at runtime the
> shared libraries from the glibc version used for linking
> lib/libpvfs2-server.a(pint-util-server.o): In function `PINT_check_group':
> /home/andrew/src/orangefs-2.8.5/src/common/misc/pint-util.c:890: warning:
> Using 'getgrgid_r' in statically linked applications requires at runtime
> the shared libraries from the glibc version used for linking
> src/server/pvfs2-server-server.o: In function `PINT_server_access_debug':
> /home/andrew/src/orangefs-2.8.5/src/server/pvfs2-server.c:2300: warning:
> Using 'getpwuid' in statically linked applications requires at runtime the
> shared libraries from the glibc version used for linking
> lib/libpvfs2-server.a(pint-util-server.o): In function `PINT_check_group':
> /home/andrew/src/orangefs-2.8.5/src/common/misc/pint-util.c:872: warning:
> Using 'getpwuid_r' in statically linked applications requires at runtime
> the shared libraries from the glibc version used for linking
> /usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/../../../../lib64/libdb.a(os_addrinfo.o):
> In function `__os_getaddrinfo':
> (.text+0x2d): warning: Using 'getaddrinfo' in statically linked
> applications requires at runtime the shared libraries from the glibc
> version used for linking
> lib/libpvfs2-server.a(bmi-tcp-server.o): In function
> `BMI_tcp_addr_rev_lookup_unexpected':
> /home/andrew/src/orangefs-2.8.5/src/io/bmi/bmi_tcp/bmi-tcp.c:1842:
> warning: Using 'gethostbyaddr' in statically linked applications requires
> at runtime the shared libraries from the glibc version used for linking
> lib/libpvfs2-server.a(sockio-server.o): In function `BMI_sockio_init_sock':
> /home/andrew/src/orangefs-2.8.5/src/io/bmi/bmi_tcp/sockio.c:140: warning:
> Using 'gethostbyname' in statically linked applications requires at runtime
> the shared libraries from the glibc version used for linking
> /usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/../../../../lib64/librt.a(aio_misc.o):
> In function `handle_fildes_io':
> (.text+0x165): undefined reference to `pthread_getschedparam'
> /usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/../../../../lib64/librt.a(aio_misc.o):
> In function `handle_fildes_io':
> (.text+0x1ae): undefined reference to `pthread_setschedparam'
> /usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/../../../../lib64/librt.a(aio_misc.o):
> In function `__aio_enqueue_request':
> (.text+0x8ee): undefined reference to `pthread_getschedparam'
> collect2: ld returned 1 exit status
> make: *** [src/server/pvfs2-server] Error 1
>
> The problem is in the -lpthread -lrt sequence: rt may use pthread, thus
> pthread should be added later. The proposed patch fixes this.
>
> Best regards,
> Andrew Savchenko

From bircoph at gmail.com Sat Feb 18 07:07:49 2012
From: bircoph at gmail.com (Andrew Savchenko)
Date: Sat Feb 18 07:08:00 2012
Subject: [Pvfs2-developers] [PATCH] Fix pvfs2fuse installation in a sandboxed environment
Message-ID: <20120218160749.c3b56fb5.bircoph@gmail.com>

Skipped content of type multipart/mixed
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20120218/c1006b0a/attachment.bin

From dimstamat at gmail.com Sun Feb 19 19:17:01 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Sun Feb 19 19:16:51 2012
Subject: [Pvfs2-developers] no space left on device when running on amazon ec2

Hello!
I have successfully run a pvfs installation on a eucalyptus cloud, but
when I moved to Amazon EC2, I get a very strange error.
When I run a metadata server it says:

[S 02/20 00:04] PVFS2 Server ready.

and then it says:

[E 02/20 00:04] batch_create request got: No space left on device
....... And this error repeats ......

I checked all of my devices and there is plenty of space, so I don't think
the devices are actually full...
Can you explain that?
What is the batch_create function? And where is it trying to write?

Here is the output of the df -h on the data node:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.9G  2.7G  6.8G  29% /
tmpfs                 308M     0  308M   0% /lib/init/rw
udev                   10M  108K  9.9M   2% /dev
tmpfs                 308M  4.0K  308M   1% /dev/shm

and on the meta data node:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.9G  2.0G  7.5G  21% /
tmpfs                 308M     0  308M   0% /lib/init/rw
udev                   10M  108K  9.9M   2% /dev
tmpfs                 308M  4.0K  308M   1% /dev/shm

Many thanks,
Dimos.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20120220/e5087dbc/attachment.htm

From dimstamat at gmail.com Sun Feb 19 20:49:50 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Sun Feb 19 20:49:40 2012
Subject: [Pvfs2-developers] Re: no space left on device when running on amazon ec2
In-Reply-To:
References:
Message-ID:

I forgot to tell you that I use ec2-associate-address commands to tell the new master to grab the elastic IP address. If I do not use replication and I use the normal IP addresses, it works fine!
Is there a way to have high availability on Amazon EC2 without using elastic IPs?

Many thanks,
Dimos.

On Mon, Feb 20, 2012 at 2:17 AM, Dimos Stamatakis wrote:

> Hello!
> I have successfully run a pvfs installation on a eucalyptus cloud, but
> when I moved to Amazon EC2, I get a very strange error.
> When I run a metadata server it says:
>
> [S 02/20 00:04] PVFS2 Server ready.
>
> and then it says:
>
> [E 02/20 00:04] batch_create request got: No space left on device
> ....... And this error repeats ......
>
> I checked all of my devices and there is plenty of space, so I don't think
> there is not enough space left...
> Can you explain that?
> What is the batch_create function? And where is it trying to write?
>
> Here is the output of the df -h on the data node:
>
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda1             9.9G  2.7G  6.8G  29% /
> tmpfs                 308M     0  308M   0% /lib/init/rw
> udev                   10M  108K  9.9M   2% /dev
> tmpfs                 308M  4.0K  308M   1% /dev/shm
>
> and on the meta data node:
>
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda1             9.9G  2.0G  7.5G  21% /
> tmpfs                 308M     0  308M   0% /lib/init/rw
> udev                   10M  108K  9.9M   2% /dev
> tmpfs                 308M  4.0K  308M   1% /dev/shm
>
> Many thanks,
> Dimos.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20120220/ffdfe9c8/attachment.htm

From dimstamat at gmail.com Sun Feb 19 22:48:30 2012
From: dimstamat at gmail.com (Dimos Stamatakis)
Date: Sun Feb 19 22:48:18 2012
Subject: [Pvfs2-developers] Re: no space left on device when running on amazon ec2
In-Reply-To:
References:
Message-ID:

Hello again!
To help you work out what is going wrong: the client blocks at a pvfs2-ls (it does not say connection refused immediately). It can also ping the new elastic IP normally!

I redirected the metadata server output to a file, and when I checked it I didn't find anything wrong... It did all the gets and puts that happen every time the DB is created. Here is the output tail:

get (handle: 4611686018427387903)()(key_sz:8) -> (511)(4)
put (handle: 4611686018427387903)()(key_sz:8) -> (512)(4)
[1329709072:419164][4413/140213703595776] TROVE:DBPF:Berkeley DB: bulk_msg: Send buffer after copy due to PERM
[1329709072:419173][4413/140213703595776] TROVE:DBPF:Berkeley DB: send_bulk: Send 160 (0xa0) bulk buffer bytes
[1329709072:419183][4413/140213703595776] TROVE:DBPF:Berkeley DB: //pvfs2-storage-space/27c41225/ rep_send_message: msgv = 7 logv 19 gen = 1 eid -1, type bulk_log, LSN [1][217660] perm
[1329709072:419193][4413/140213703595776] TROVE:DBPF:Berkeley DB: rep_send_function returned: -30975

How can I find out why this metadata server refuses to serve the client requests?

Thanks again,
Dimos.

On Mon, Feb 20, 2012 at 3:49 AM, Dimos Stamatakis wrote:

> I forgot to tell you that I use ec2-associate-address commands to tell the
> new master to grab the elastic IP address. If I do not use replication and
> I use the normal IP addresses it works fine!
> Is there a way to have High availability to amazon EC2 without use of
> elastic IPs??
>
> Many thanks,
> Dimos.
>
>
>
> On Mon, Feb 20, 2012 at 2:17 AM, Dimos Stamatakis wrote:
>
>> Hello!
>> I have successfully run a pvfs installation on a eucalyptus cloud, but
>> when I moved to Amazon EC2, I get a very strange error.
>> When I run a metadata server it says:
>>
>> [S 02/20 00:04] PVFS2 Server ready.
>>
>> and then it says:
>>
>> [E 02/20 00:04] batch_create request got: No space left on device
>> ....... And this error repeats ......
>>
>> I checked all of my devices and there is plenty of space, so I don't
>> think there is not enough space left...
>> Can you explain that?
>> What is the batch_create function? And where is it trying to write?
>> >> Here is the output of the df -h on the data node: >> >> Filesystem Size Used Avail Use% Mounted on >> /dev/sda1 9.9G 2.7G 6.8G 29% / >> tmpfs 308M 0 308M 0% /lib/init/rw >> udev 10M 108K 9.9M 2% /dev >> tmpfs 308M 4.0K 308M 1% /dev/shm >> >> and on the meta data node: >> >> Filesystem Size Used Avail Use% Mounted on >> /dev/sda1 9.9G 2.0G 7.5G 21% / >> tmpfs 308M 0 308M 0% /lib/init/rw >> udev 10M 108K 9.9M 2% /dev >> tmpfs 308M 4.0K 308M 1% /dev/shm >> >> Many thanks, >> Dimos. >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20120220/8f30dcd1/attachment.htm From ligon at omnibond.com Mon Feb 20 10:42:21 2012 From: ligon at omnibond.com (Becky Ligon) Date: Mon Feb 20 10:42:11 2012 Subject: [Pvfs2-developers] Re: no space left on device when running on amazon ec2 In-Reply-To: References: Message-ID: When a server is started, it sends a batch_create request to every other server in the filesystem. The batch_create request asks the receiving server to send back a list of unused data handles (owned by that particular server). For those handles in the list, the receiving server sets an attribute in the local database to indicate that the handle is in use. You may see a batch_create request after your servers have been running for a while, since a server will request another batch of handles if its current store gets low. This entire process is a performance enhancement, which allows a file's data handles to be assigned by the metadata server without contacting the data handle servers, thus reducing the time it takes to create a file. With all of that said, it seems that one of your servers is having trouble accessing the database or communicating with another server. I'm not exactly sure without further research. Think about the above description and see if you can't pinpoint which server is causing the trouble. 
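The handle precreation scheme described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the names handle_pool, pool_init, pool_refill, pool_take, POOL_CAPACITY and LOW_WATER are hypothetical and do not come from the OrangeFS source, and the remote data server is reduced to a counter that hands out its next unused handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not OrangeFS code. */
#define POOL_CAPACITY 8   /* handles cached for one remote data server */
#define LOW_WATER     2   /* refill when fewer than this many remain */

typedef struct {
    uint64_t handles[POOL_CAPACITY];
    size_t   count;           /* handles currently cached */
    uint64_t next_remote;     /* stand-in for the remote server's next unused handle */
    unsigned batch_requests;  /* number of batch_create round trips so far */
} handle_pool;

/* Stand-in for one batch_create request: the remote server marks a batch
 * of unused handles as in use and sends them back. */
static void pool_refill(handle_pool *p)
{
    while (p->count < POOL_CAPACITY) {
        p->handles[p->count++] = p->next_remote++;
    }
    p->batch_requests++;
}

/* A server asks for its first batch when it starts. */
static void pool_init(handle_pool *p, uint64_t first_handle)
{
    p->count = 0;
    p->next_remote = first_handle;
    p->batch_requests = 0;
    pool_refill(p);
}

/* Assign one data handle to a new file from the local cache; contact the
 * data server again only when the cache is running low. */
static uint64_t pool_take(handle_pool *p)
{
    assert(p->count > 0);
    uint64_t h = p->handles[--p->count];
    if (p->count < LOW_WATER) {
        pool_refill(p);
    }
    return h;
}
```

In this sketch a file create costs a round trip to the data server only when the cached store drops below LOW_WATER, which is the saving the explanation above refers to.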
Becky On Sun, Feb 19, 2012 at 10:48 PM, Dimos Stamatakis wrote: > Hello again! > I want to help you realize what is going wrong by telling you that the > client blocks at a pvfs2-ls (it does not say connection refused > immediatelly). It can also ping the new elastic IP normally! > > I redirected the metadata server output to a file and when I checked it i > didn't find anything wrong... It did all the gets - puts that are happening > everytime the DB is created. Here is the output tail: > > get (handle: 4611686018427387903)()(key_sz:8) -> (511)(4) > put (handle: 4611686018427387903)()(key_sz:8) -> (512)(4) > [1329709072:419164][4413/140213703595776] TROVE:DBPF:Berkeley DB: > bulk_msg: Send buffer after copy due to PERM > [1329709072:419173][4413/140213703595776] TROVE:DBPF:Berkeley DB: > send_bulk: Send 160 (0xa0) bulk buffer bytes > [1329709072:419183][4413/140213703595776] TROVE:DBPF:Berkeley DB: > //pvfs2-storage-space/27c41225/ rep_send_message: msgv = 7 logv 19 gen = 1 > eid -1, type bulk_log, LSN [1][217660] perm > [1329709072:419193][4413/140213703595776] TROVE:DBPF:Berkeley DB: > rep_send_function returned: -30975 > > How can I find out why this metadata server refuses serving the client > requests? > > Thanks again, > Dimos. > > > > On Mon, Feb 20, 2012 at 3:49 AM, Dimos Stamatakis wrote: > >> I forgot to tell you that I use ec2-associate-address commands to tell >> the new master to grab the elastic IP address. If I do not use replication >> and I use the normal IP addresses it works fine! >> Is there a way to have High availability to amazon EC2 without use of >> elastic IPs?? >> >> Many thanks, >> Dimos. >> >> >> >> On Mon, Feb 20, 2012 at 2:17 AM, Dimos Stamatakis wrote: >> >>> Hello! >>> I have successfully run a pvfs installation on a eucalyptus cloud, but >>> when I moved to Amazon EC2, I get a very strange error. >>> When I run a metadata server it says: >>> >>> [S 02/20 00:04] PVFS2 Server ready. 
>>> >>> and then it says: >>> >>> [E 02/20 00:04] batch_create request got: No space left on device >>> ....... And this error repeats ...... >>> >>> I checked all of my devices and there is plenty of space, so I don't >>> think there is not enough space left... >>> Can you explain that? >>> What is the batch_create function? And where is it trying to write? >>> >>> Here is the output of the df -h on the data node: >>> >>> Filesystem Size Used Avail Use% Mounted on >>> /dev/sda1 9.9G 2.7G 6.8G 29% / >>> tmpfs 308M 0 308M 0% /lib/init/rw >>> udev 10M 108K 9.9M 2% /dev >>> tmpfs 308M 4.0K 308M 1% /dev/shm >>> >>> and on the meta data node: >>> >>> Filesystem Size Used Avail Use% Mounted on >>> /dev/sda1 9.9G 2.0G 7.5G 21% / >>> tmpfs 308M 0 308M 0% /lib/init/rw >>> udev 10M 108K 9.9M 2% /dev >>> tmpfs 308M 4.0K 308M 1% /dev/shm >>> >>> Many thanks, >>> Dimos. >>> >> >> > > _______________________________________________ > Pvfs2-developers mailing list > Pvfs2-developers@beowulf-underground.org > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers > > -- Becky Ligon OrangeFS Support and Development Omnibond Systems Anderson, South Carolina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20120220/06b36737/attachment-0001.htm From dimstamat at gmail.com Mon Feb 20 12:27:10 2012 From: dimstamat at gmail.com (Dimokritos Stamatakis) Date: Mon Feb 20 12:28:12 2012 Subject: [Pvfs2-developers] Re: Pvfs2-developers Digest, Vol 72, Issue 8 In-Reply-To: <201202201542.q1KFgEGj003079@parlweb.parl.clemson.edu> References: <201202201542.q1KFgEGj003079@parlweb.parl.clemson.edu> Message-ID: <4696A8A4-5745-449F-80BE-8D65F36AC29A@gmail.com> Hello and thanks for your answer! The problem was the amazon ec2-disassociate command, where the public IP became unreachable. I now invoke just the ec2-associate and everything is fine!! Thanks to all for trying to help, Dimos. 
From ligon at omnibond.com Mon Feb 20 22:16:44 2012 From: ligon at omnibond.com (Becky Ligon) Date: Mon Feb 20 22:16:34 2012 Subject: [Pvfs2-developers] Re: [Pvfs2-users] OrangeFS on nodes with different storage space capacities. In-Reply-To: <20120221013035.198bca6a.bircoph@gmail.com> References: <20120220211549.78041655.bircoph@gmail.com> <20120221013035.198bca6a.bircoph@gmail.com> Message-ID: Andrew: It is possible to use a different distribution (other than simple stripe) that will take advantage of servers with different space allocations. However, the code that performs this service hasn't been used in a very long time. My co-worker is going to look into it and see if this distribution is still viable. I'll get back to you when we know. Becky On Mon, Feb 20, 2012 at 4:30 PM, Andrew Savchenko wrote: > Hello, > > On Mon, 20 Feb 2012 13:47:45 -0500 Becky Ligon wrote: > > Andrew: > > > > Having two servers on the 2S machine should work fine, but it does depend > > on the storage backend and how it is connected to the server. If you are > > using local storage, then you are correct in that having two servers > > *could* be a performance hit; however, if you are using some sort of SAN, > > then that statement may not be valid. In our own environment, we had 4 > > servers running on one machine attached to a DDN device, and performance > > was good. How is your 2S storage attached? > > Each S is just a local SATA hard drive, though with good hardware > cache. Each node has 8 cores and 8 GB RAM.
2S node is a master node,
> it has double in disks and half in CPU/RAM: 4/4. On top of that each
> node will be used for computing (MPI/PBS), master node will be used
> only partially. I'm aware, using the same hosts for both storage and
> computing will hurt overall performance, but with current hardware it
> is the best I can do.
>
> Best regards,
> Andrew Savchenko
>

--
Becky Ligon
OrangeFS Support and Development
Omnibond Systems
Anderson, South Carolina
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20120220/e302b273/attachment.htm

From ligon at omnibond.com Tue Feb 21 17:44:10 2012
From: ligon at omnibond.com (Becky Ligon)
Date: Tue Feb 21 17:44:01 2012
Subject: [Pvfs2-developers] Re: [Pvfs2-users] OrangeFS on nodes with different storage space capacities.
In-Reply-To:
References: <20120220211549.78041655.bircoph@gmail.com> <20120221013035.198bca6a.bircoph@gmail.com>
Message-ID:

Andrew:

We have a distribution that should help you with your storage problem. Here is a link to a wiki that describes how to use it: http://www.orangefs.org/trac/orangefs/wiki/Distributions.

In your case, I would add this distribution to your config file:

    Name varstrip_dist
    Param strips
    Value 0:1S; 1:1S; 2:2S; 3:1S

where 1S=64K and 2S=128K (for example).

You can set the distribution but, by default, the list of physical servers chosen for a logical file is still round-robin. So, for example, if file1 is given servers a1,a2,a3,a4, in that order, then server a3 will be given twice the data, because the distribution specifies that the third server in the list should have twice the data. The next file created may be given a2,a3,a4,a1, in that order, so server a4 will be given twice the data. Distributions are not directly tied to specific servers.
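To make the effect of the server order concrete, here is a small C sketch that adds up how much data each physical server receives under the example weights (1S, 1S, 2S, 1S). It is only an illustration: the function names, the NUM_SERVERS constant, and the assumption that each new file starts one server later than the previous one are mine, not taken from the OrangeFS source. For contrast it also shows a fixed order in which every file starts at the first server.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_SERVERS 4

/* Strip weights from the example, in units of S:
 * list positions 0, 1 and 3 get 1S, position 2 gets 2S. */
static const unsigned slot_weight[NUM_SERVERS] = {1, 1, 2, 1};

/* Add one file's strips to the per-server totals. first_server is the
 * physical server placed in list position 0; the remaining positions
 * follow in config-file order, wrapping around. */
static void add_file(unsigned totals[NUM_SERVERS], size_t first_server)
{
    for (size_t slot = 0; slot < NUM_SERVERS; slot++) {
        size_t server = (first_server + slot) % NUM_SERVERS;
        totals[server] += slot_weight[slot];
    }
}

/* Assumed round-robin order: file i starts at server i % NUM_SERVERS. */
static void simulate_round_robin(unsigned totals[NUM_SERVERS], size_t nfiles)
{
    for (size_t i = 0; i < nfiles; i++) {
        add_file(totals, i % NUM_SERVERS);
    }
}

/* Fixed order: every file starts at the first server. */
static void simulate_fixed(unsigned totals[NUM_SERVERS], size_t nfiles)
{
    for (size_t i = 0; i < nfiles; i++) {
        add_file(totals, 0);
    }
}
```

Under these assumptions, four files created round-robin leave every server with 5S, so the larger disk gets no extra share, while four files created in a fixed order leave the third server with 8S and the others with 4S each.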
To modify this round-robin behavior, you will have to change one line of code in src/client/sysint/sys-create.sm, in function PVFS_isys_create:

    if(layout)
    {
        /* make sure it is a supported layout */
        switch(layout->algorithm)
        {
            /* these are valid */
            case PVFS_SYS_LAYOUT_ROUND_ROBIN:
            case PVFS_SYS_LAYOUT_RANDOM:
            case PVFS_SYS_LAYOUT_LIST:
                break;
            /* anything else is not */
            default:
                return(-PVFS_EINVAL);
        }

        sm_p->u.create.layout.algorithm = layout->algorithm;
        if(layout->algorithm == PVFS_SYS_LAYOUT_LIST)
        {
            sm_p->u.create.layout.server_list.count = layout->server_list.count;
            sm_p->u.create.layout.server_list.servers =
                malloc(layout->server_list.count * sizeof(PVFS_BMI_addr_t));
            if(!sm_p->u.create.layout.server_list.servers)
            {
                return -PVFS_ENOMEM;
            }
            memcpy(sm_p->u.create.layout.server_list.servers,
                   layout->server_list.servers,
                   layout->server_list.count * sizeof(PVFS_BMI_addr_t));
        }
    }
    else
    /***** change the default layout in the line below to
           PVFS_SYS_LAYOUT_NONE instead of PVFS_SYS_LAYOUT_ROUND_ROBIN *****/
    {
        sm_p->u.create.layout.algorithm = PVFS_SYS_LAYOUT_ROUND_ROBIN;
    }

By specifying PVFS_SYS_LAYOUT_NONE, the system will ALWAYS choose server zero first (the first server that you defined in your config file) instead of the next round-robin server, followed by server one, server two, and server three, again in the order the servers are defined in your config file. With the strips value given earlier, server 0 will be given 1S of data, server 1 will be given 1S of data, server 2 will be given 2S of data, and server 3 will be given 1S of data.

Continuing with the above distribution example, whenever a file is created, the order of the servers will always be a1,a2,a3,a4. So, a3 will always be given twice as much data as each of the other three. Just be sure that a3 is the third server specified in the config file.

Apparently, when this variable strip distribution was developed, not all of the user APIs were updated to allow easy use. Hopefully, we can change that!

Let me know if you have any questions. We haven't actually tried this and would like to know if it is still working properly!

Thanks,
Becky

On Mon, Feb 20, 2012 at 10:16 PM, Becky Ligon wrote:

> Andrew:
>
> It is possible to use a different distribution (other than simple stripe)
> that will take advantage of servers with different space allocations.
> However, the code that performs this service hasn't been used in a very
> long time. My co-worker is going to look into it and see if this
> distribution is still viable. I'll get back to you when we know.
> > Becky > > > On Mon, Feb 20, 2012 at 4:30 PM, Andrew Savchenko wrote: > >> Hello, >> >> On Mon, 20 Feb 2012 13:47:45 -0500 Becky Ligon wrote: >> > Andrew: >> > >> > Having two servers on the 2S machine should work fine, but it does >> depend >> > on the storage backend and how it is connected to the server. If you >> are >> > using local storage, then you are correct in that having two servers >> > *could* be a performance hit; however, if you are using some sort of >> SAN, >> > then that statement may not be valid. In our own environment, we had 4 >> > servers running on one machine attached to a DDN device, and performance >> > was good. How is your 2S storage attached? >> >> Each S is just a local SATA hard drive, though with good hardware >> cache. Each node has 8 cores and 8 GB RAM. 2S node is a master node, >> it has double in disks and half in CPU/RAM: 4/4. On top of that each >> node will be used for computing (MPI/PBS), master node will be used >> only partially. I'm aware, using the same hosts for both storage and >> computing will hurt overall performance, but with current hardware it >> is the best I can do. >> >> Best regards, >> Andrew Savchenko >> > > > > -- > Becky Ligon > OrangeFS Support and Development > Omnibond Systems > Anderson, South Carolina > > > -- Becky Ligon OrangeFS Support and Development Omnibond Systems Anderson, South Carolina -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20120221/f945aeaa/attachment-0001.htm