[Pvfs2-developers] Re: double-free in BMI_ib_cancel

Sam Lang slang at mcs.anl.gov
Tue Apr 8 16:03:35 EDT 2008


On Apr 8, 2008, at 2:21 PM, Kyle Schochenmaier wrote:
> I think this is a fairly critical fix and will affect the majority of
> the ib users I know of, who are all running cvs head for various
> reasons.
> The patch only moves two #defines around so I'm not sure why we should
> hold off on it, maybe I'm missing something?

Nope, I just need to know whether to merge it to the release branch or  
not.
-sam

>
>
> Kyle
>
> On Tue, Apr 8, 2008 at 1:47 PM, Sam Lang <slang at mcs.anl.gov> wrote:
>>
>> Does this need to go in the next release?
>> -sam
>>
>>
>>
>> On Apr 8, 2008, at 1:43 PM, Pete Wyckoff wrote:
>>
>>>
>>>
>>>
>>> kschoche at gmail.com wrote on Tue, 08 Apr 2008 13:17 -0500:
>>>
>>>> Troy and I stumbled across this bug, that at least for our
>>>> configurations, causes a double-free on the server when cleaning up
>>>> 'stale' connections.
>>>>
>>>> It seems that some logic is duplicated here when cleaning up
>>>> connections that disappear to the server:
>>>>
>>>> #if !MEMCACHE_BOUNCEBUF
>>>>          if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)  /*
>>>> --------------- here-------------- */
>>>>              memcache_deregister(ib_device->memcache, &rq- 
>>>> >buflist);
>>>> #  if MEMCACHE_EARLY_REG
>>>>          /* pin on post, dereg all these */
>>>>          if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION  
>>>> ||
>>>>              rq->state.recv == RQ_RTS_WAITING_RTS_DONE)    /*
>>>> ----------------- and here ------------ */
>>>>              memcache_deregister(ib_device->memcache, &rq- 
>>>> >buflist);
>>>>          if (rq->state.recv == RQ_WAITING_INCOMING
>>>>            && rq->buflist.tot_len > ib_device->eager_buf_payload)
>>>>              memcache_deregister(ib_device->memcache, &rq- 
>>>> >buflist);
>>>> #  endif
>>>>
>>>> The patch below worked cleans it up, and compiles and runs  
>>>> *here*, but
>>>> we have some weird hanging issues after closing connections,  
>>>> basically
>>>> no new unexpected requests are processed until after the BMItimeout
>>>> and/or FlowTimeout (pvfs2.conf)  have expired.  We're not sure if  
>>>> that
>>>> is a seperate issue, it hangs and dies w/o this patch, now we just
>>>> hang.  This fixes the segfault which is a more major issue.
>>>> Patch is against CVS head from 20 minutes ago.
>>>>
>>>
>>> Yep, good stuff.  I was too eager in adding extra cancel states
>>> back when Troy pointed out there were problems.  This fixes the
>>> bugs that went in with that patch on 15 feb 08.  I think it should
>>> do the saame thing as what you were aiming at.  Let me know if
>>> you like it.
>>>
>>>               -- Pete
>>>
>>> P.S.  Put a line "diff -uNp" in your ~/.cvsrc.  Unified diffs are
>>> so much easier to read.
>>>
>>> commit 0872d4cace8e9cc98ac1a0831d56fb784f2fc0ff
>>> Author: Pete Wyckoff <pw at osc.edu>
>>> Date:   Tue Apr 8 14:40:37 2008 -0400
>>>
>>>  bmi ib: fix double free in cancel
>>>
>>>  Earlier checkin to clean up more thoroughly during a cancel
>>>  errantly introduced a double free in certain states.  Remove
>>>  those.
>>>
>>> diff --git a/src/io/bmi/bmi_ib/ib.c b/src/io/bmi/bmi_ib/ib.c
>>> index f8d6b41..76deb7d 100644
>>> --- a/src/io/bmi/bmi_ib/ib.c
>>> +++ b/src/io/bmi/bmi_ib/ib.c
>>> @@ -1496,8 +1496,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id
>> context_id __unused)
>>>           /* pin when sending rts, so also must dereg in this  
>>> state */
>>>           if (sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION ||
>>>               sq->state.send ==  
>>> SQ_WAITING_RTS_SEND_COMPLETION_GOT_CTS ||
>>> -               sq->state.send == SQ_WAITING_CTS ||
>>> -               sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION)
>>> +               sq->state.send == SQ_WAITING_CTS)
>>>               memcache_deregister(ib_device->memcache, &sq- 
>>> >buflist);
>>> #  endif
>>> #endif
>>> @@ -1512,8 +1511,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id
>> context_id __unused)
>>>               memcache_deregister(ib_device->memcache, &rq- 
>>> >buflist);
>>> #  if MEMCACHE_EARLY_REG
>>>           /* pin on post, dereg all these */
>>> -           if (rq->state.recv ==  
>>> RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
>>> -               rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
>>> +           if (rq->state.recv ==  
>>> RQ_RTS_WAITING_CTS_SEND_COMPLETION)
>>>               memcache_deregister(ib_device->memcache, &rq- 
>>> >buflist);
>>>           if (rq->state.recv == RQ_WAITING_INCOMING
>>>             && rq->buflist.tot_len > ib_device->eager_buf_payload)
>>> _______________________________________________
>>> Pvfs2-developers mailing list
>>> Pvfs2-developers at beowulf-underground.org
>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>>
>>>
>>
>>
>
>
>
> -- 
> Kyle Schochenmaier
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2417 bytes
Desc: not available
Url : http://www.beowulf-underground.org/pipermail/pvfs2-developers/attachments/20080408/63325977/smime-0001.bin


More information about the Pvfs2-developers mailing list