Use the first sister message as a Heartbeat, too #93
I've noticed that when the sister is killed shortly after being started, there is a (very repeatable) chance of deadlocking the other Bond. Consider this sequence:

1. The first message from the sister triggers `AwaitingSister->SisterAlive()`, which calls `Connect()`.
2. `.Connect()` stops the connection timer.
3. The bond is now in the `Alive` state. However, `Heartbeat()` has not yet been called (that would be done by the second message, which never came), so `heartbeat_timer_` is still off and does not trigger the timeout event.

The fix I sent fixes it on the C++ side. I'm not sure about the Python side or about the SM source code. `Heartbeat()` needs to be called once the bond is already in the `Alive` state (it is undefined in `AwaitingSister`). From quickly skimming the SM syntax, it doesn't seem to support calling some functions before the transition and some after it.

If that is the case, I'd suggest calling `Heartbeat` in the `AwaitingSister` state (right after calling `Connected()`) and defining it for the `AwaitingSister` state to do the same as for the `Alive` state. That would require no additional changes to the SM definition, so it might be preferred.

I'll let the maintainers decide which approach would be better.
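
For illustration only, here is a minimal sketch of the deadlock and of the effect of treating the first sister message as a heartbeat. It is a simplified model, not the actual bondcpp code: `BondModel`, `SilenceElapsed()`, `StuckAfterSisterDies()` and the boolean timer flags are hypothetical names invented for this example, and real timers are reduced to "armed / not armed".

```cpp
#include <cassert>

// Hypothetical, simplified model of one side of a bond (not the real bondcpp classes).
enum class State { AwaitingSister, Alive, Dead };

struct BondModel {
  State state = State::AwaitingSister;
  bool connect_timer_running = true;     // armed while waiting for the sister
  bool heartbeat_timer_running = false;  // armed only by Heartbeat()
  bool heartbeat_on_first_message;       // true models the proposed fix

  explicit BondModel(bool fix) : heartbeat_on_first_message(fix) {}

  void Connected() { connect_timer_running = false; }
  void Heartbeat() { heartbeat_timer_running = true; }

  // Called for every status message received from the sister.
  void SisterAlive() {
    if (state == State::AwaitingSister) {
      Connected();           // stops the connection timer
      state = State::Alive;  // transition AwaitingSister -> Alive
      if (heartbeat_on_first_message) {
        Heartbeat();         // fix: the first message also arms the heartbeat timer
      }
    } else if (state == State::Alive) {
      Heartbeat();           // later messages re-arm the heartbeat timer
    }
  }

  // Models a long silence from the sister: only an armed timer can fire.
  void SilenceElapsed() {
    if (state == State::Alive && heartbeat_timer_running) {
      heartbeat_timer_running = false;
      state = State::Dead;   // heartbeat timeout breaks the bond
    }
  }
};

// Returns true if the bond is still Alive after the sister sent exactly one
// message and then went silent, i.e. the deadlock described above.
bool StuckAfterSisterDies(bool fix) {
  BondModel bond(fix);
  bond.SisterAlive();     // first and only message from the sister
  bond.SilenceElapsed();  // sister was killed; no second message arrives
  return bond.state == State::Alive;
}
```

In this model, `StuckAfterSisterDies(false)` leaves the bond in `Alive` with neither timer armed, while `StuckAfterSisterDies(true)` lets the heartbeat timeout fire and break the bond.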