#community-help

Typesense Cluster Operations and Recovery

TLDR: Adrian asked how Typesense handles reads and recovers from quorum loss in high-availability settings. Kishore Nallan clarified the design decisions and noted that v0.25 RC adds automatic recovery in specific scenarios.

Solved
Mar 28, 2023 (8 months ago)
Adrian
10:25 PM
Hello, my company is evaluating Typesense to replace Algolia for our search use cases. We will be self-hosting due to customer security requirements, and we have a high-availability SLA for this service. I have a few questions based on this section of the Typesense documentation on HA. Cc Jason, since we talked about adjacent things when we met last Friday. For context, I am generally familiar with the Raft algorithm (I implemented a simple version a few years ago), but not with how it is used in Typesense or with the implementation details of Braft.
1. It makes sense to me that Typesense stops accepting writes when quorum is lost, but why does it stop accepting reads, given that reads are "served by the node that receives it" during normal operation? I don't see why reads could not continue to be served by any nodes still running once quorum is lost. The reads should be no more out of date than is already possible in normal operation, unless I am missing something here.
2. Why can the cluster not recover from quorum loss without manual intervention? If quorum is lost but the down nodes then come back online, I would expect a normal election to be possible and a new leader to be elected. The documentation cites the risk of split brain, but as far as I know this is not possible in Raft, since any write requires an ack from a majority of nodes, so there can be only one active leader at a time (see the sketch after this message).
I appreciate any input on these points!
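
(For readers following along: Adrian's last claim rests on Raft's overlapping-majority property: any two majorities of an N-node cluster share at least one node, so two leaders can never both collect a majority of votes or write acks in the same term. Below is a minimal Python sketch of that arithmetic; it is illustrative only and is not Typesense or Braft code.)

```python
from itertools import combinations

def quorum_size(n: int) -> int:
    """Smallest majority of an n-node cluster."""
    return n // 2 + 1

def majorities_overlap(n: int) -> bool:
    """True if every pair of majority-sized node sets shares a node."""
    majorities = list(combinations(range(n), quorum_size(n)))
    return all(set(a) & set(b) for a, b in combinations(majorities, 2))

# Any two majorities of a 3- or 5-node cluster intersect, which is why
# at most one candidate per term can be acknowledged as leader.
print(quorum_size(3), majorities_overlap(3))  # 2 True
print(quorum_size(5), majorities_overlap(5))  # 3 True
```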
Mar 29, 2023 (8 months ago)
Kishore Nallan
12:29 AM
1. When all 3 nodes are unable to form a cluster, then yes, accepting reads is not a problem. But in the case where 2/3 nodes are operating as a cluster and the third node is partitioned, the 2/3 nodes will keep receiving writes and be more up-to-date than the 3rd node -- that can cause confusing behavior for users. It's difficult to differentiate between these two scenarios, so we are taking the safer route (see the sketch below).
2. The cluster does recover from quorum loss automatically. If you mean the issues we have seen on Kubernetes, those are different in nature. The raft nodes are identified by their IP addresses, so when all 3 pods are replaced and come back up with new IP addresses, the new pods think they are misconfigured because their IP addresses do not match what's in the internal cluster state (see the nodes-file example below). I've just done some work to overcome this issue automatically. Please try this out and let me know if it helps: https://typesense-community.slack.com/archives/C01P749MET0/p1680002564216939?thread_ts=1678739509.970809&cid=C01P749MET0
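
(On point 1: from a single node's perspective, "my peers are partitioned away" and "my peers are dead" look identical, so a minority node cannot tell whether fresher data exists elsewhere. A hypothetical sketch of that local decision follows; the function name and structure are illustrative, not Typesense internals.)

```python
def can_safely_serve_reads(reachable_peers: int, cluster_size: int) -> bool:
    """A node only knows how many peers it can currently reach.

    Without a majority, it cannot tell whether it is the partitioned
    minority of a healthy cluster that is still accepting writes (stale
    reads) or the last survivor of a dead cluster (reads would be fine),
    so the safe choice is to refuse reads in both cases.
    """
    return reachable_peers + 1 >= cluster_size // 2 + 1

# A lone node in a 3-node cluster sees the same thing in both scenarios:
print(can_safely_serve_reads(reachable_peers=0, cluster_size=3))  # False
```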
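
(On point 2: each Typesense node is started with a `--nodes` flag pointing to a file that lists every peer as `ip:peering_port:api_port`, where 8107 and 8108 are the default peering and API ports. An example with made-up pod IPs:)

```
10.42.0.11:8107:8108,10.42.0.12:8107:8108,10.42.0.13:8107:8108
```

(When all three pods are rescheduled and receive fresh IPs, none of these entries match the addresses persisted in the raft state on disk, which is the mismatch described above.)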
Kishore Nallan
01:36 AM
Caveat: I'm assuming that the quorum loss is not because of data loss, i.e. that the data directory is persistent.


Adrian
02:44 AM
1. That design decision makes sense. Thanks for clarifying!
2. Makes sense and great to hear! I will try it out. But does that mean this paragraph from the documentation is not accurate?
> If a Typesense cluster loses more than (N-1)/2 nodes at the same time, the cluster becomes unstable because it loses quorum and the remaining node(s) cannot safely build consensus on which node is the leader. To avoid a potential split brain issue, Typesense then stops accepting writes and reads until some manual verification and intervention is done.
This makes me think manual intervention is required in the case of quorum loss, but what you said about it being automatic fits with what I would expect.
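
(To make the quoted formula concrete: a cluster of N nodes tolerates the loss of at most (N-1)/2 nodes, because the survivors must still form a majority. Plain arithmetic, not Typesense code:)

```python
def max_tolerable_failures(n: int) -> int:
    """Largest simultaneous node loss that still leaves a majority."""
    return (n - 1) // 2

for n in (1, 3, 5):
    print(f"{n} nodes -> tolerates {max_tolerable_failures(n)} failure(s)")
# 1 nodes -> tolerates 0 failure(s)
# 3 nodes -> tolerates 1 failure(s)
# 5 nodes -> tolerates 2 failure(s)
```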
Kishore Nallan
03:01 AM
Depends on the nature of the quorum loss. If 2 of 3 instances just die, or their network is permanently cut, then the remaining node will reject both reads and writes due to point 1 above. Manual intervention is then needed to restore the network and/or the broken machines.

The auto recovery is specifically for the scenario where the nodes come back up with different IPs and try to form the cluster again. This previously required manual intervention in v0.24, but with the v0.25 RC (which is still a beta build) we can handle it automatically. It still needs more testing.
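
(For what it's worth, a common way to sidestep the changing-IP problem on Kubernetes is to run Typesense as a StatefulSet behind a headless service and list the pods' stable DNS names in the nodes file instead of IPs; Typesense accepts hostnames there. A hypothetical example, where the service and namespace names depend on your setup:)

```
typesense-0.ts.default.svc.cluster.local:8107:8108,typesense-1.ts.default.svc.cluster.local:8107:8108,typesense-2.ts.default.svc.cluster.local:8107:8108
```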
