-
Notifications
You must be signed in to change notification settings - Fork 4k
Description
Is your feature request related to a problem? Please describe.
Streams rely on erlang monitors for failure detection which in turn rely on the net_ticktime
and net_intensity
settings. This means it could take around 1 minute to detect that a writer is on a node that is no longer responding (either by being partitioned or because it had been force killed and the other nodes did not detect yet that the TCP connection has gone down).
Ra uses the aten
library for failure detection and thus can fail over more quickly and regain availability. Ra can handle false positives very well where streams have a less light weight election process (the stream cluster needs to be restarted) and thus we still want to ensure we try very hard to act on false positives. That said it should be possible to do better than 1 minute by default. It is likely we can never reliably go as low as Ra which normally completes a new election within 5-10 seconds when a leader is partitioned off.
Describe the solution you'd like
When the stream coordinator becomes leader it spawns a companion process which will register for node notifications from the aten
application. When it receives a node down it can, like Ra, wait one aten poll interval and check if the node that is assumed to be down is still in nodes()
. If it is and aten still thinks it is down it is likely it has been partitioned off (or hard killed). In this case this process can query the stream coordinator for all streams that have leaders (writers) on this node.
After receiving the results the process can check again if aten
thinks the node is down and issue a restart stream command against the stream coordinator for each stream with a leader on the down node to force a new stream election.
If at any point the process detects that the local stream coordinator member is no longer the leader it should cease operations and hibernate until notified.
the stream coordinator will activate the local process by sending a message to it from the state_enter(leader, ...
callback.
Describe alternatives you've considered
No response