[Pvfs2-developers] Re: double-free in BMI_ib_cancel

Pete Wyckoff pw at osc.edu
Tue Apr 8 14:43:48 EDT 2008


kschoche at gmail.com wrote on Tue, 08 Apr 2008 13:17 -0500:
> Troy and I stumbled across this bug, that at least for our
> configurations, causes a double-free on the server when cleaning up
> 'stale' connections.
> 
> It seems that some logic is duplicated here when cleaning up
> connections that disappear to the server:
> 
> #if !MEMCACHE_BOUNCEBUF
>             if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)  /*
> --------------- here-------------- */
>                 memcache_deregister(ib_device->memcache, &rq->buflist);
> #  if MEMCACHE_EARLY_REG
>             /* pin on post, dereg all these */
>             if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
>                 rq->state.recv == RQ_RTS_WAITING_RTS_DONE)    /*
> ----------------- and here ------------ */
>                 memcache_deregister(ib_device->memcache, &rq->buflist);
>             if (rq->state.recv == RQ_WAITING_INCOMING
>               && rq->buflist.tot_len > ib_device->eager_buf_payload)
>                 memcache_deregister(ib_device->memcache, &rq->buflist);
> #  endif
> 
> The patch below worked cleans it up, and compiles and runs *here*, but
> we have some weird hanging issues after closing connections, basically
> no new unexpected requests are processed until after the BMItimeout
> and/or FlowTimeout (pvfs2.conf)  have expired.  We're not sure if that
> is a seperate issue, it hangs and dies w/o this patch, now we just
> hang.  This fixes the segfault which is a more major issue.
> Patch is against CVS head from 20 minutes ago.

Yep, good stuff.  I was too eager in adding extra cancel states
back when Troy pointed out there were problems.  This fixes the
bugs that went in with that patch on 15 feb 08.  I think it should
do the saame thing as what you were aiming at.  Let me know if
you like it.

		-- Pete

P.S.  Put a line "diff -uNp" in your ~/.cvsrc.  Unified diffs are
so much easier to read.

commit 0872d4cace8e9cc98ac1a0831d56fb784f2fc0ff
Author: Pete Wyckoff <pw at osc.edu>
Date:   Tue Apr 8 14:40:37 2008 -0400

    bmi ib: fix double free in cancel
    
    Earlier checkin to clean up more thoroughly during a cancel
    errantly introduced a double free in certain states.  Remove
    those.

diff --git a/src/io/bmi/bmi_ib/ib.c b/src/io/bmi/bmi_ib/ib.c
index f8d6b41..76deb7d 100644
--- a/src/io/bmi/bmi_ib/ib.c
+++ b/src/io/bmi/bmi_ib/ib.c
@@ -1496,8 +1496,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id context_id __unused)
 	    /* pin when sending rts, so also must dereg in this state */
 	    if (sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION ||
 	        sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION_GOT_CTS ||
-	        sq->state.send == SQ_WAITING_CTS ||
-		sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION)
+	        sq->state.send == SQ_WAITING_CTS)
 		memcache_deregister(ib_device->memcache, &sq->buflist);
 #  endif
 #endif
@@ -1512,8 +1511,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id context_id __unused)
 		memcache_deregister(ib_device->memcache, &rq->buflist);
 #  if MEMCACHE_EARLY_REG
 	    /* pin on post, dereg all these */
-	    if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
-	        rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
+	    if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION)
 		memcache_deregister(ib_device->memcache, &rq->buflist);
 	    if (rq->state.recv == RQ_WAITING_INCOMING
 	      && rq->buflist.tot_len > ib_device->eager_buf_payload)


More information about the Pvfs2-developers mailing list