#community-help

Troubleshooting Typesense Cluster Multi-node Leadership Error

TLDR: Bill set up a new Typesense cluster and got a "no leader" error, with every node reporting an unhealthy status. Jason and Kishore Nallan suggested a communication issue between nodes, and the logs showed a peer IP from a different subnet. Kishore Nallan recommended clearing the data directory on all nodes and restarting them. After this, Bill reported the error resolved.

Jan 16, 2022 (20 months ago)
Bill
10:01 PM
Hello, we just set up a new Typesense cluster and we receive -> Multi-node with no leader: refusing to reset peers. We hadn't had any issue before. The health status on all nodes is false and debug shows state: 4. Any solution?
Jan 17, 2022 (20 months ago)
Jason
01:56 AM
Bill, if you have at least 3 nodes, then most likely one of the nodes is not able to communicate with one or more of the other nodes, so it can't get a quorum to elect a leader in the cluster.
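As a rough illustration of the quorum point, here is a minimal sketch assuming the usual Raft rule that a leader needs a strict majority of the configured nodes. The function names are hypothetical and not part of Typesense.

```python
def quorum_size(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def can_elect_leader(total_nodes: int, reachable_nodes: int) -> bool:
    """reachable_nodes counts the node itself plus the peers it can talk to."""
    return reachable_nodes >= quorum_size(total_nodes)


for n in (3, 5):
    print(f"{n} nodes: quorum {quorum_size(n)}, tolerates {n - quorum_size(n)} down")
# 3 nodes: quorum 2, tolerates 1 down
# 5 nodes: quorum 3, tolerates 2 down
```

Under this assumption, a node in a 3-node cluster that can reach none of its peers only counts itself, which is below the quorum of 2, so no leader can be elected.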
Bill
10:04 AM
Jason, the steps that I followed are: 1) wget the package, 2) install, 3) stop the server, 4) create the nodes file in /etc/typesense, 5) update typesense-server.ini with nodes, peering, etc., 6) start the nodes one by one
Kishore Nallan
10:12 AM
Can you post the actual logs from any one node? It seems like a firewall issue to me. Maybe try telnet to the host and port from one of the nodes to the other nodes.
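If telnet is not installed, the same reachability check can be sketched with Python's standard `socket` module. This is only an illustration: the helper names are hypothetical, and the ports 8107 (peering) and 8108 (API) are taken from the configuration shared in this thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_peers(peers, ports=(8107, 8108), timeout: float = 3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in peers
        for port in ports
    }
```

Run from one node against the others, for example `check_peers(["10.114.0.2", "10.114.0.3"])`, any `False` entry would point to a firewall rule or a server that is not listening on that port.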
Bill
11:14 AM
W20220117 11:12:53.172574 2229 raft_server.cpp:551] Multi-node with no leader: refusing to reset peers.
I20220117 11:12:54.934175 2240 node.cpp:1484] node default_group:10.114.0.4:8107:8108 term 2 start pre_vote
W20220117 11:12:54.934615 2240 node.cpp:1494] node default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in 10.19.0.7:8107:8108
I20220117 11:13:00.795843 2239 node.cpp:1484] node default_group:10.114.0.4:8107:8108 term 2 start pre_vote
W20220117 11:13:00.796419 2239 node.cpp:1494] node default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in 10.19.0.7:8107:8108
I20220117 11:13:03.173813 2229 raft_server.cpp:524] Term: 2, last_index index: 1, committed_index: 0, known_applied_index: 0, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 0
W20220117 11:13:03.173848 2229 raft_server.cpp:551] Multi-node with no leader: refusing to reset peers.
I20220117 11:13:06.508644 2237 node.cpp:1484] node default_group:10.114.0.4:8107:8108 term 2 start pre_vote
W20220117 11:13:06.508692 2237 node.cpp:1494] node default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in 10.19.0.7:8107:8108
Bill
11:14 AM
Kishore Nallan, those are the logs on all nodes
Kishore Nallan
11:16 AM
> node default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in 10.19.0.7:8107:8108
Looks like the node was part of another cluster earlier? I see two different IPs there from entirely different subnets as well.
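The mismatch can also be spotted mechanically. The following is a small sketch, assuming the warning keeps the exact wording shown in the logs above; the regular expression and the function name are illustrative, not part of Typesense.

```python
import re

# Matches the "can't do pre_vote" warning as it appears in the logs above.
PRE_VOTE_RE = re.compile(
    r"node \S+?:(?P<self_ip>[\d.]+):\d+:\d+ "
    r"can't do pre_vote as it is not in (?P<conf>\S+)"
)


def unexpected_peer_ips(log_lines, expected_ips):
    """Collect IPs from the logged peer configuration that are not expected."""
    expected = set(expected_ips)
    unexpected = set()
    for line in log_lines:
        match = PRE_VOTE_RE.search(line)
        if not match:
            continue
        for peer in match.group("conf").split(","):
            ip = peer.split(":")[0]
            if ip and ip not in expected:
                unexpected.add(ip)
    return unexpected


sample = [
    "W20220117 11:12:54.934615 2240 node.cpp:1494] node "
    "default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in "
    "10.19.0.7:8107:8108",
]
print(unexpected_peer_ips(sample, ["10.114.0.2", "10.114.0.3", "10.114.0.4"]))
# prints {'10.19.0.7'}
```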
Bill
11:16 AM
Yes me too
Bill
11:16 AM
But there is no other cluster
Bill
11:16 AM
I just created 3 new droplets in DigitalOcean
Bill
11:16 AM
and deployed Typesense
Kishore Nallan
11:16 AM
Can you stop the nodes, delete the data directory and start one by one again?
Bill
11:18 AM
I have tried many times, in different orders. I stopped 1 node to stabilize it as a single node, and then I updated typesense-server.ini so it works as a multi-node cluster
Kishore Nallan
11:18 AM
Can you post the contents of the ini file?
Bill
11:18 AM
I get the same error again -> Multi-node with no leader: refusing to reset peers
Bill
11:19 AM
api-address = 0.0.0.0
api-port = 8108
data-dir = /var/lib/typesense
api-key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
log-dir = /var/log/typesense
peering-address = 10.114.0.4
peering-port = 8107
nodes = /etc/typesense/nodes.txt
Kishore Nallan
11:20 AM
Ok, that's fine. What are the contents of /etc/typesense/nodes.txt?
Bill
11:20 AM
Like the line you have in the docs
Bill
11:20 AM
nothing special
Kishore Nallan
11:20 AM
What is this IP: 10.19.0.7? Is it some other node on your infrastructure?
Bill
11:20 AM
10.114.0.2:8107:8108,10.114.0.3:8107:8108,10.114.0.4:8107:8108
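A quick sanity check of this line against the ini file can be sketched as follows. It assumes every entry has the form ip:peering_port:api_port with plain IP addresses, as in the line above; the `Node` type and both function names are hypothetical.

```python
import ipaddress
from typing import NamedTuple


class Node(NamedTuple):
    ip: str
    peering_port: int
    api_port: int


def parse_nodes(text: str) -> list[Node]:
    """Parse a comma-separated nodes line into Node tuples."""
    nodes = []
    for entry in text.strip().split(","):
        parts = entry.strip().split(":")
        if len(parts) != 3:
            raise ValueError(f"expected ip:peering_port:api_port, got {entry!r}")
        ip, peering_port, api_port = parts
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        nodes.append(Node(ip, int(peering_port), int(api_port)))
    return nodes


def is_listed(nodes, peering_address: str, peering_port: int) -> bool:
    """Check that this node's own peering address appears in the nodes list."""
    return any(
        n.ip == peering_address and n.peering_port == peering_port for n in nodes
    )


nodes = parse_nodes("10.114.0.2:8107:8108,10.114.0.3:8107:8108,10.114.0.4:8107:8108")
print(is_listed(nodes, "10.114.0.4", 8107))  # True
print(is_listed(nodes, "10.19.0.7", 8107))   # False
```

Here the peering-address from the ini file (10.114.0.4) is in the list, so the configuration itself looks consistent, while 10.19.0.7 from the logs is not.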
Kishore Nallan
11:21 AM
If you clear the data directory and restart, there is no way for that IP to come into the picture...
Bill
11:21 AM
I don't have any node with this IP on DigitalOcean
Bill
11:21 AM
I have even deleted the droplets and created new ones, but this IP is there again
Kishore Nallan
11:21 AM
Ok, let's try this:

1. Stop Typesense server on all nodes
2. Do rm -rf /var/lib/typesense/* on all nodes.
3. Start any one node and check the logs. What do they say?
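Step 2 can be sketched in Python with a dry-run guard, since it wipes all data on the node. This is an illustration only: the function name is hypothetical, stopping and starting the server is left to the operator, and unlike the shell glob in `rm -rf /var/lib/typesense/*` this version also removes hidden entries.

```python
import shutil
from pathlib import Path


def clear_directory_contents(data_dir: str, dry_run: bool = True) -> list[str]:
    """Delete everything inside data_dir but keep the directory itself.

    With dry_run=True nothing is deleted; the entries that would be
    removed are only listed.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise NotADirectoryError(str(root))
    removed = []
    for entry in root.iterdir():
        removed.append(entry.name)
        if dry_run:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return sorted(removed)
```

Calling `clear_directory_contents("/var/lib/typesense")` first shows what would be removed, and passing `dry_run=False` performs the deletion. The server should be stopped before running it.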
Bill
11:22 AM
start the node as single or multi?
Kishore Nallan
11:22 AM
Just start it as it is with the multi node configuration.
Bill
11:22 AM
ok just a moment
Bill
11:25 AM
I got connection refused, then I started the other 2 nodes, and now I get peer refresh succeeded
Kishore Nallan
11:26 AM
Yeah you are good to go then. I'm sure that the data directory had some previous state.
Bill
11:26 AM
Maybe, but this was an install on a new instance
Bill
11:26 AM
without data
Bill
11:27 AM
What exactly is -> rm -rf /var/lib/typesense/* ?
Kishore Nallan
11:27 AM
It deletes the contents of the Typesense data directory.
Bill
11:27 AM
Is it safe to use it with records on production?
Kishore Nallan
11:28 AM
You mean whether the cluster is fine to use in production now? Yes, absolutely. I have no idea what happened earlier, but this is a clean state.
Bill
11:30 AM
Yes. For example, I have a 3-node cluster now and I want to add 2 more nodes (a 5-node cluster). If I get this error (Multi-node with no leader: refusing to reset peers), can I use this command?
Bill
11:30 AM
Without losing data
Kishore Nallan
11:31 AM
This will only happen when you mess up the cluster state by replacing all nodes with different IPs at once.
Kishore Nallan
11:31 AM
If you want to be sure and get to the bottom of the original issue, maybe you can try doing the cluster provisioning from scratch to confirm.
Bill
11:32 AM
What do you mean by from scratch? Set the nodes.txt with 5 peering IPs even if they don't exist now?
Kishore Nallan
11:33 AM
I mean, you said that you faced this issue on a brand new droplet, right? If you want, you can try creating a cluster from new droplets again. But whatever happened earlier, the cluster is fine now, so it should be fine.
Kishore Nallan
11:33 AM
To add 2 more nodes, you just need to update the nodes file. It will work.
Bill
11:35 AM
Perfect okay, thank you Kishore!
Kishore Nallan
11:35 AM
Welcome.