Streams: faster response to potential network partitions #9275

@kjnilsson

Is your feature request related to a problem? Please describe.

Streams rely on Erlang monitors for failure detection, which in turn rely on the net_ticktime and net_tickintensity settings. This means it can take around 1 minute to detect that a writer is on a node that is no longer responding (either because the node has been partitioned off, or because it was force killed and the other nodes have not yet detected that the TCP connection has gone down).

Ra uses the aten library for failure detection and thus can fail over more quickly and regain availability. Ra handles false positives very well, whereas streams have a more heavyweight election process (the stream cluster needs to be restarted), so we still want to try very hard not to act on false positives. That said, it should be possible to do better than 1 minute by default. We can likely never reliably go as low as Ra, which normally completes a new election within 5-10 seconds when a leader is partitioned off.

Describe the solution you'd like

When the stream coordinator becomes leader it spawns a companion process which registers for node notifications from the aten application. When this process receives a node down notification it can, like Ra, wait one aten poll interval and then check whether the node that is assumed to be down is still in nodes(). If it is, and aten still thinks the node is down, the node has likely been partitioned off (or hard killed). In this case the process can query the stream coordinator for all streams that have leaders (writers) on this node.

After receiving the results, the process can check again whether aten still thinks the node is down and, if so, issue a restart stream command against the stream coordinator for each stream with a leader on the down node, forcing a new stream election.
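The decision steps above could be sketched roughly as follows. This is only an illustration of the control flow, not a proposed implementation: the module name and every entry in the `Env` map (`is_leader`, `aten_down`, `connected`, `wait`, `leaders_on`, `restart_stream`) are hypothetical injected funs that stand in for the real aten, Ra and stream coordinator calls, which this sketch deliberately does not name.

```erlang
-module(stream_partition_watch_sketch).
-export([handle_node_down/2]).

%% Env is a map of injected funs (all hypothetical, for illustration only):
%%   is_leader      :: fun(() -> boolean())     is the local coordinator member leader?
%%   aten_down      :: fun((node()) -> boolean()) does aten currently consider Node down?
%%   connected      :: fun(() -> [node()])      e.g. fun erlang:nodes/0
%%   wait           :: fun(() -> ok)            blocks for one aten poll interval
%%   leaders_on     :: fun((node()) -> [term()]) streams whose leader (writer) is on Node
%%   restart_stream :: fun((term()) -> term())  issues a restart for one stream

%% Called when a node down notification for Node has been received.
handle_node_down(Node, #{wait := Wait,
                         leaders_on := LeadersOn,
                         restart_stream := Restart} = Env) ->
    %% Give the failure detector one poll interval to change its mind.
    ok = Wait(),
    case partition_suspected(Node, Env) of
        false ->
            ignored;
        true ->
            Streams = LeadersOn(Node),
            %% Re-check after the query: the node may have come back
            %% or leadership may have moved in the meantime.
            case still_down(Node, Env) of
                true ->
                    lists:foreach(Restart, Streams),
                    {restarted, Streams};
                false ->
                    ignored
            end
    end.

%% Distribution still lists the node as connected while aten reports it
%% as down: it is likely partitioned off or was hard killed.
partition_suspected(Node, #{connected := Connected} = Env) ->
    still_down(Node, Env) andalso lists:member(Node, Connected()).

%% Only act while this member is leader and aten still reports Node down.
still_down(Node, #{is_leader := IsLeader, aten_down := AtenDown}) ->
    IsLeader() andalso AtenDown(Node).
```

Passing the checks in as funs keeps the sketch free of assumptions about the real APIs and makes the "check, query, check again" ordering easy to exercise in isolation.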

If at any point the process detects that the local stream coordinator member is no longer the leader it should cease operations and hibernate until notified.

The stream coordinator will activate the local process by sending a message to it from the state_enter(leader, ...) callback.
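As a rough illustration of the activation step, a Ra machine can return a `send_msg` effect from `state_enter/2`. The sketch below is a standalone module, not the actual stream coordinator callback (which does other work on entering a state); the registered process name `stream_partition_watcher` and the message `leader_activated` are assumptions made up for this example.

```erlang
-module(stream_coordinator_enter_sketch).
-export([state_enter/2]).

%% Hypothetical registered name of the local companion process.
-define(WATCHER, stream_partition_watcher).

%% On becoming leader, ask Ra to send an activation message to the
%% local companion process via a send_msg effect.
state_enter(leader, _State) ->
    [{send_msg, ?WATCHER, leader_activated}];
%% No effects for any other Ra state in this sketch.
state_enter(_RaState, _State) ->
    [].
```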

Describe alternatives you've considered

No response

Additional context

No response
