[Pvfs2-developers] msgpair error
Sam Lang
slang at mcs.anl.gov
Fri Jun 16 18:05:55 EDT 2006
On Jun 15, 2006, at 2:20 PM, Pete Wyckoff wrote:
> rross at mcs.anl.gov wrote on Thu, 15 Jun 2006 13:59 -0500:
>> What's actually just as curious to me is that we don't ever call
>> PINT_serv_free_msgpair_resources() in error cases if we *don't*
>> succeed. Does this mean that every msgpairarray failure results in
>> some resources being leaked? Making that "else" block would
>> effectively just skip this code, avoiding the core dump but not
>> freeing the resources.
>>
>> Perhaps we should have PINT_serv_decode_resp() initialize
>> decoded_resp prior to exit and have
>> PINT_serv_free_msgpair_resources() understand a NULL value for
>> that input parameter?
>
> Good point. The server response came with this status:
>
> (gdb) p decoded_resp_p->stub_dec.resp.status
> $5 = -1073741828
>
> But in the decode process, after decoding the status, we find it
> is EIO, and skip the rest of the decoding in lebf_decode_resp,
> so the contents of u.lookup_path were never initialized:
>
> (gdb) p decoded_resp_p->stub_dec.resp.u.lookup_path
> $6 = {handle_array = 0xf8, attr_array = 0x537c30, handle_count = 5471472,
> attr_count = 0}
>
> And later we try to free(handle_array).
>
> The special-case EIO handling in PINT-le-bytefield.c:605
> was added for a reason, maybe because the server sends a bogus
> message that shouldn't be decoded. Here's robl's log message when
> adding the check:
>
> revision 1.29
> date: 2004/09/29 20:32:09; author: robl; state: Exp; lines: +49 -39
> probably did not do this in a general-enough way: don't try to
> decode/encode response-specific portions of a server response if
> something catastrophic happened on the server end.
>
> I'm sure once Sam takes a look at the dbpf issue, the EIO problem
> will go away. Not sure what the right way to design this
> client-side problem is, though, in case we get more server EIO
> issues in the future.
I haven't been able to reproduce this specific error or get dbench to
fail for me, but Murali just committed a bunch of fixes to bugs in
the keyval changes I made recently that I could imagine causing a lot
of weird behaviors. If you find that these commits fix the dbench
errors you were seeing, let me know. I'd prefer to wait to clean up
the error handling on the server (db->err and the weird EIO case) as
well as the msgpairarray problems until after the release.
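For whenever that cleanup happens, here is a rough sketch of the approach Rob suggested above: zero the decoded response before any early return, and make the free routine tolerate NULL. Everything in it (the sketch_* names, the struct layout, the toy wire format) is made up for illustration and is not the actual PVFS2 code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the decoded response; NOT the real PVFS2 types. */
struct sketch_lookup_resp {
    long *handle_array;
    long *attr_array;
    unsigned handle_count;
    unsigned attr_count;
};

struct sketch_resp {
    int status;
    union {
        struct sketch_lookup_resp lookup_path;
    } u;
};

/*
 * Decode a toy wire format: wire[0] is the status, wire[1] a handle
 * count, followed by that many handles. The output is zeroed up front,
 * so an early return on a non-zero status leaves NULL pointers behind
 * instead of whatever happened to be in memory.
 */
int sketch_decode_resp(const int *wire, size_t nwords, struct sketch_resp *out)
{
    *out = (struct sketch_resp){0};
    if (nwords < 1)
        return -1;
    out->status = wire[0];
    if (out->status != 0)
        return 0; /* skip the op-specific part, as the EIO check does */
    if (nwords < 2 || wire[1] < 0 || (size_t)wire[1] > nwords - 2)
        return -1;
    unsigned n = (unsigned)wire[1];
    if (n > 0) {
        out->u.lookup_path.handle_array = malloc(n * sizeof(long));
        if (out->u.lookup_path.handle_array == NULL)
            return -1;
        for (unsigned i = 0; i < n; i++)
            out->u.lookup_path.handle_array[i] = wire[2 + i];
    }
    out->u.lookup_path.handle_count = n;
    return 0;
}

/* Safe on NULL and on a response whose op-specific part was never decoded. */
void sketch_free_resp(struct sketch_resp *resp)
{
    if (resp == NULL)
        return;
    free(resp->u.lookup_path.handle_array); /* free(NULL) is a no-op */
    free(resp->u.lookup_path.attr_array);
    *resp = (struct sketch_resp){0};
}
```

With something like this, the error path could call the free routine unconditionally, which would also address the leak Rob pointed out.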
-sam
>
> -- Pete