[Pvfs2-users] PVFS2 over Infiniband error
Florin Isaila
florin.isaila at gmail.com
Mon Nov 26 17:55:15 EST 2007
Hi,
I checked "max locked memory" and it is set to unlimited on both
machines (PVFS2 client and server):
max locked memory (kbytes, -l) unlimited
Infiniband fabric is:
lslogin2% lspci | grep Infi
0c:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA] (rev a0)
Florin
On Nov 26, 2007 3:40 PM, Pete Wyckoff <pw at osc.edu> wrote:
> florin.isaila at gmail.com wrote on Fri, 16 Nov 2007 16:30 -0600:
> > I am coming back to a problem I still have with PVFS 2.6.3 over IB.
> >
> > I run it on Lonestar - Xeon Intel Duo-Core 64bit cluster at TACC:
> > http://www.tacc.utexas.edu/services/userguides/lonestar/
> >
> > I remind you that PVFS-IB works on the front end, but fails when I try
> > to start it on the compute nodes.
> >
> > As Pete suggested I had set the debug level to network.
> >
> > I found that on each run one of two types of errors shows up:
> >
> > 1) this is from the previous message I sent to the list
> > > > [E 10:04:01.781047] Error: openib_mem_register: ibv_register_mr.
> >
> > 2) this I just got (the full messages are at the end of this mail):
> > [E 12:05:07.676399] Error: openib_ib_initialize: ibv_create_cq failed.
>
> This comes before the register_mr so let's tackle it first.
>
> > As Pete suggested I looked in /etc/security/limits.conf: soft and hard
> > memlock are set to unlimited.
>
> Nice to know, but just to be sure, sit on the machine where you are
> getting the error message, in bash, and do "ulimit -a" and tell us
> what "max locked memory" says. I bet it is 32. That would explain
> why the CQ fails: it tries to pin 1k elements of 32 bytes each.
>
> > I do not have control over the nodes, I cannot install things, I am
> > just a user :)
>
> If this is true, complain to your admin. He probably forgot to do
> "ulimit -l unlimited" in the PBS mom startup script, if you are
> landing on the nodes thanks to "qsub -I". I wonder how anybody has
> been able to run any MPI/IB codes. If you are getting there via
> rsh or ssh, limits.conf should be doing the trick, but maybe there
> is some hokeyness in /etc/profile.d/* or similar. You will have to
> nose around.
>
> > Pete, how can I find out what type of Infiniband fabric is installed?
>
> lspci | grep Infi
>
> -- Pete
>
>
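
The check Pete describes can be sketched as a small script to run on the
node where the error appears, reached the same way the failing job gets
there (for example from inside a "qsub -I" session). This is only a rough
sketch: the function names cq_kbytes and memlock_ok are hypothetical
helpers, not part of PVFS2, and the CQ size is taken from the figure quoted
above (1k elements of 32 bytes each). It also assumes that a limit which
does not exceed the CQ size is suspect, since any other pinned memory
counts against the same limit.

```shell
#!/bin/bash
# Sketch: compare this shell's locked-memory limit with the CQ size
# mentioned in the thread. Helper names are hypothetical.

# cq_kbytes ENTRIES ENTRY_BYTES: print the CQ size in kbytes, rounded up.
cq_kbytes() {
    echo $(( ($1 * $2 + 1023) / 1024 ))
}

# memlock_ok LIMIT NEEDED_KB: succeed if LIMIT (as printed by "ulimit -l")
# is "unlimited" or strictly larger than NEEDED_KB.
memlock_ok() {
    [ "$1" = "unlimited" ] || [ "$1" -gt "$2" ]
}

limit=$(ulimit -l)              # kbytes, or the word "unlimited"
needed=$(cq_kbytes 1024 32)     # 1k elements of 32 bytes each

echo "node $(uname -n): max locked memory = ${limit} kbytes, CQ needs about ${needed} kbytes"
if memlock_ok "$limit" "$needed"; then
    echo "locked-memory limit looks large enough for the CQ"
else
    echo "locked-memory limit is too small; ask the admin to raise it"
fi
```

If the script prints a different limit on the compute node than in a login
shell on the front end, the limit is being set (or not set) somewhere along
the path the job takes to the node, not in limits.conf alone.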