[Pvfs2-developers] Re: bmi_ib resource constraints with older hardware
Troy Benjegerdes
troy at scl.ameslab.gov
Wed Mar 12 22:29:49 EST 2008
> Just hack up anything you like to get it to work. If it fixes the
> situation, we'll go back and clean up the code later.
>
> It is optimistic, what you're trying to do, but I'm not sure if it
> will be sufficient. If there are no credits to get back from
> checking the CQ, you'll just deadlock. I'm also nervous about
> locking implications, as you're checking the CQ in the thread that
> is trying to do the send. Not sure if we have done this before.
>
> A simpler way would be just to fail whatever operation got us
> into this RDMA, by abandoning it, with another state that says we're
> waiting on credits. An easier first step is just to add lots of
> printfs to track the credits and see if you can correlate a credit
> overflow with the RDMA failures. If that works, a check at the top
> of "post rdma" can say whether we should even bother and we won't
> need your fixup step of looking at the CQ from the send.
>
> -- Pete
>
This seems to work a little better:
http://www.scl.ameslab.gov/~troy/pvfs/ibv_post_send/retry-ibv_post_send.diff
and gives output like this:
http://www.scl.ameslab.gov/~troy/pvfs/ibv_post_send/pvfs2-server-ib-da1.log
http://www.scl.ameslab.gov/~troy/pvfs/ibv_post_send/pvfs2-server-ib-da3.log
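
For reference, the credit check Pete suggests at the top of "post rdma" could look roughly like the sketch below. This is not the code in the diff and not the bmi_ib source: the names (struct conn, post_rdma, return_credits, POST_WAITING_ON_CREDITS) are made up for illustration, the credit limit is assumed to be a fixed send queue depth, and the real ibv_post_send() call is left as a comment so the sketch runs without IB hardware.

```c
/* Hypothetical sketch: none of these names come from bmi_ib or the diff. */
#include <assert.h>
#include <stdio.h>

enum post_result { POST_OK, POST_WAITING_ON_CREDITS };

struct conn {
    int send_credits;  /* work requests we may still post */
    int max_credits;   /* assumed fixed send queue depth */
};

/* Check at the top of "post rdma": refuse up front instead of overflowing. */
static enum post_result post_rdma(struct conn *c, int num_wr)
{
    /* printf-style tracking, to correlate credit levels with failures */
    fprintf(stderr, "post_rdma: want %d, have %d of %d credits\n",
            num_wr, c->send_credits, c->max_credits);

    if (num_wr > c->send_credits)
        return POST_WAITING_ON_CREDITS;  /* caller parks the operation */

    c->send_credits -= num_wr;
    /* a real implementation would call ibv_post_send() here */
    return POST_OK;
}

/* Called from wherever completions are normally reaped from the CQ. */
static void return_credits(struct conn *c, int num_completed)
{
    c->send_credits += num_completed;
    assert(c->send_credits <= c->max_credits);
    fprintf(stderr, "return_credits: now %d of %d credits\n",
            c->send_credits, c->max_credits);
}
```

With this shape the send path never looks at the CQ itself. An operation that gets POST_WAITING_ON_CREDITS is parked and retried after the normal completion path has called return_credits(), which sidesteps the locking concern raised above.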