[Pvfs2-developers] segfault in openib code

Kyle Schochenmaier kschoche at scl.ameslab.gov
Fri Sep 15 10:58:56 EDT 2006


Pete -

I've been trying to debug a problem where my MD server goes down, or 
rather times out, closes its connections for some reason, and cancels 
BMI jobs.  While doing so, I ran into a segfault in 
openib_close_connection:

static void openib_close_connection(ib_connection_t *c)
{
    int ret;
    struct openib_connection_priv *oc = c->priv;

    /* destroy the queue pairs */

<snip>

    free(oc);
}

Since my gdb backtrace doesn't go into any ibv_* functions, I'm assuming 
this free() call is the culprit.
I'm not sure why this free() would segfault, but until we can work out 
why the connections are being closed, I think it may be a good idea to 
put a check in there to make sure oc is still valid.
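
A minimal sketch of what such a check might look like, under stated
assumptions: the two types below are reduced, hypothetical stand-ins for
the real PVFS2 definitions, the function name
openib_close_connection_guarded is made up for illustration, and the
int return value (the original returns void) exists only so the outcome
can be observed. Note that a NULL test only catches a pointer that was
never set or was explicitly cleared; it cannot detect memory that was
already freed elsewhere, so clearing c->priv after the free() is what
makes a second close harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real PVFS2 types, reduced to what
 * this sketch needs. */
struct openib_connection_priv {
    int placeholder;  /* the real struct holds the queue pair state */
};

typedef struct {
    void *priv;
} ib_connection_t;

/* Returns 1 if private state was torn down, 0 if there was nothing
 * to tear down (assumption: the original function returns void). */
static int openib_close_connection_guarded(ib_connection_t *c)
{
    struct openib_connection_priv *oc;

    assert(c != NULL);
    oc = c->priv;
    if (oc == NULL)
        return 0;     /* already closed, or never fully set up */

    /* destroy the queue pairs (elided, as in the original) */

    free(oc);
    c->priv = NULL;   /* a repeated close now hits the check above */
    return 1;
}
```

With this in place, calling the function twice on the same connection
frees the private state once and returns 0 the second time. If the
crash persists with the guard, the pointer is probably being freed or
overwritten somewhere else, and a memory checker such as valgrind would
be the next thing to try.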

Has anyone run into this or other issues with servers going down in openib?

    -- Kyle


-- 
Kyle Schochenmaier
kschoche at scl.ameslab.gov
Research Assistant, Dr. Brett Bode
AmesLab - US Dept.Energy
Scalable Computing Laboratory 


