#community-help

Troubleshooting Typesense Cluster Multi-node Leadership Error

TLDR Bill had a problem with a new Typesense cluster: every node logged "Multi-node with no leader: refusing to reset peers" and reported an unhealthy status. Jason and Kishore Nallan first suspected a communication issue between nodes. After reviewing the logs, Kishore Nallan suggested stopping all nodes and clearing the data directory, which likely held stale cluster state. Following this, Bill reported the error resolved.

Jan 16, 2022 (24 months ago)
Bill
10:01 PM
Hello, we just set up a new Typesense cluster and we are getting -> Multi-node with no leader: refusing to reset peers. We didn't have any issues before. The health status on all nodes is false, and debug shows state: 4. Any solution?
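For readers unfamiliar with the "state: 4" value: as far as I know, the Typesense `/debug` endpoint reports state 1 for a leader and 4 for a follower, so every node reporting 4 would mean no node has been elected leader. The sketch below only interprets state values that have already been collected; the helper names are hypothetical and other state codes are deliberately left unnamed.

```python
# Assumption: 1 = leader and 4 = follower, as described for the /debug endpoint.
# Any other value is reported as-is rather than guessed at.
KNOWN_STATES = {1: "leader", 4: "follower"}


def describe_state(state: int) -> str:
    """Return a readable role name for a /debug state value."""
    return KNOWN_STATES.get(state, f"other ({state})")


def cluster_has_leader(states: list[int]) -> bool:
    """True if at least one node reports the leader state."""
    return any(state == 1 for state in states)
```

With three nodes all reporting 4, `cluster_has_leader([4, 4, 4])` is false, which matches the "no leader" warning in the message above.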
Jan 17, 2022 (24 months ago)
Jason
01:56 AM
Bill If you have at least 3 nodes, then it's most likely that one of the nodes is not able to communicate with one or more of the other nodes, so it's not able to get a quorum to elect a leader in the cluster.
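To make the quorum point concrete, here is a minimal sketch of the usual majority rule in Raft-style clusters: a leader can only be elected when more than half of the configured nodes can reach each other. The function names are hypothetical and this is an illustration, not Typesense's own code.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def can_elect_leader(cluster_size: int, reachable_nodes: int) -> bool:
    """reachable_nodes counts the node itself plus the peers it can talk to."""
    return reachable_nodes >= quorum_size(cluster_size)
```

For a 3-node cluster the quorum is 2, so a node that can reach none of its peers cannot win an election on its own.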
Bill
10:04 AM
Jason the steps that I followed are: 1) wget package, 2) install, 3) stop server, 4) create nodes file in /etc/typesense, 5) update typesense-server.ini with nodes, peering IP, etc., 6) start the nodes one by one
Kishore Nallan
10:12 AM
Can you post the actual logs from any one node? It seems like a firewall issue to me. Maybe try telnet to the host and port from one of the nodes to the other nodes.
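If telnet is not installed, the same reachability check can be sketched in Python with a plain TCP connect. The helper names are hypothetical; ports 8107 and 8108 are the peering and API ports that appear in the configuration later in this thread.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try a TCP connection; True if something accepts it within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_peers(hosts, ports=(8107, 8108)):
    """Map each (host, port) pair to whether a TCP connection succeeded."""
    return {(host, port): port_is_open(host, port) for host in hosts for port in ports}
```

Running something like `check_peers(["10.114.0.2", "10.114.0.3"])` from the third node would show whether a firewall is blocking either port. A refused connection can also simply mean the peer is not running yet.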
Bill
11:14 AM
W20220117 11:12:53.172574 2229 raft_server.cpp:551] Multi-node with no leader: refusing to reset peers.
I20220117 11:12:54.934175 2240 node.cpp:1484] node default_group:10.114.0.4:8107:8108 term 2 start pre_vote
W20220117 11:12:54.934615 2240 node.cpp:1494] node default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in 10.19.0.7:8107:8108
I20220117 11:13:00.795843 2239 node.cpp:1484] node default_group:10.114.0.4:8107:8108 term 2 start pre_vote
W20220117 11:13:00.796419 2239 node.cpp:1494] node default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in 10.19.0.7:8107:8108
I20220117 11:13:03.173813 2229 raft_server.cpp:524] Term: 2, last_index index: 1, committed_index: 0, known_applied_index: 0, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 0
W20220117 11:13:03.173848 2229 raft_server.cpp:551] Multi-node with no leader: refusing to reset peers.
I20220117 11:13:06.508644 2237 node.cpp:1484] node default_group:10.114.0.4:8107:8108 term 2 start pre_vote
W20220117 11:13:06.508692 2237 node.cpp:1494] node default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in 10.19.0.7:8107:8108
Bill
11:14 AM
Kishore Nallan Those are the logs on all nodes
Kishore Nallan
11:16 AM
> node default_group:10.114.0.4:8107:8108 can't do pre_vote as it is not in 10.19.0.7:8107:8108
Looks like the node was part of another cluster earlier? I see two different IPs there from entirely different subnets as well.
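The mismatch can also be spotted programmatically. The sketch below parses the pre_vote warning quoted above and compares the peer list it mentions with the expected nodes line. It assumes the text after "not in" is the peer list the node currently holds, which is inferred from the message wording, and the function name is hypothetical.

```python
import re

# Matches: node <group>:<ip>:<peering_port>:<api_port> can't do pre_vote as it is not in <peers>
PRE_VOTE_RE = re.compile(
    r"node (?P<group>\S+?):(?P<node>\d+\.\d+\.\d+\.\d+:\d+:\d+) "
    r"can't do pre_vote as it is not in (?P<conf>\S+)"
)


def find_unexpected_peers(log_line: str, expected_nodes: str):
    """Return the node's own address and any peers absent from the expected list."""
    match = PRE_VOTE_RE.search(log_line)
    if match is None:
        return None
    held = set(match.group("conf").split(","))
    expected = set(expected_nodes.strip().split(","))
    return {
        "node": match.group("node"),
        "unexpected": sorted(held - expected),
        "node_in_held_list": match.group("node") in held,
    }
```

For the log line in this thread, the only unexpected entry is `10.19.0.7:8107:8108`, an address that is not in the intended node list.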
Bill
11:16 AM
Yes me too
Bill
11:16 AM
But there is no other cluster
Bill
11:16 AM
I just created 3 new droplets in DigitalOcean
Bill
11:16 AM
and deployed Typesense
Kishore Nallan
11:16 AM
Can you stop the nodes, delete the data directory and start one by one again?
Bill
11:18 AM
I have tried many times in different orders. I stopped 1 node to stabilize it like a single node, and then I updated typesense-server.ini so that it works as multi-node
Kishore Nallan
11:18 AM
Can you post the contents of the ini file?
Bill
11:18 AM
I get the same error again -> Multi-node with no leader: refusing to reset peers
Bill
11:19 AM
api-address = 0.0.0.0
api-port = 8108
data-dir = /var/lib/typesense
api-key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
log-dir = /var/log/typesense
peering-address = 10.114.0.4
peering-port = 8107
nodes = /etc/typesense/nodes.txt
Kishore Nallan
11:20 AM
Ok that's fine, content of /etc/typesense/nodes.txt ?
Bill
11:20 AM
Like the line you have in the docs
Bill
11:20 AM
nothing special
Kishore Nallan
11:20 AM
What is this IP: 10.19.0.7 -> is it some other node on your infrastructure?
Bill
11:20 AM
10.114.0.2:8107:8108,10.114.0.3:8107:8108,10.114.0.4:8107:8108
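A quick consistency check between the ini settings and the nodes line can catch a typo before the cluster is started. The sketch below assumes each nodes entry has the form `ip:peering_port:api_port`, as in the line above, and uses hypothetical helper names.

```python
def parse_nodes(nodes_line: str):
    """Split a comma-separated nodes line into (host, peering_port, api_port) tuples."""
    entries = []
    for raw in nodes_line.strip().split(","):
        host, peering_port, api_port = raw.strip().split(":")
        entries.append((host, int(peering_port), int(api_port)))
    return entries


def node_is_listed(nodes_line: str, peering_address: str, peering_port: int, api_port: int) -> bool:
    """True if this node's own ini settings appear in the nodes line."""
    return (peering_address, peering_port, api_port) in parse_nodes(nodes_line)
```

For the configuration posted in this thread, `node_is_listed(nodes_line, "10.114.0.4", 8107, 8108)` is true, so the ini file and the nodes file agree with each other and the stray IP has to come from somewhere else.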
Kishore Nallan
11:21 AM
If you clear the data directory and restart, there is no way for that IP to come into the picture...
Bill
11:21 AM
I don't have any node with this IP on DigitalOcean
Bill
11:21 AM
I have even deleted the droplets and created new ones, but this IP is there again
Kishore Nallan
11:21 AM
Ok, let's try this:

1. Stop Typesense server on all nodes
2. Do rm -rf /var/lib/typesense/* on all nodes.
3. Start any one node and check the logs. What do they say?
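Step 2 above is destructive, so a dry run first can help. The sketch below is a rough Python equivalent of `rm -rf <data-dir>/*` with a preview mode; the function name and the `dry_run` parameter are hypothetical. It skips dot-prefixed entries, as the shell glob would, and should only be run while the Typesense server is stopped.

```python
import shutil
from pathlib import Path


def clear_data_dir(data_dir: str, dry_run: bool = True) -> list[str]:
    """List (and optionally delete) the contents of data_dir, keeping the directory itself."""
    root = Path(data_dir)
    # Like the shell glob "*", ignore entries whose names start with a dot.
    targets = sorted(p for p in root.iterdir() if not p.name.startswith("."))
    if not dry_run:
        for entry in targets:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return [str(p) for p in targets]
```

Calling `clear_data_dir("/var/lib/typesense")` only prints nothing and returns what would be removed; passing `dry_run=False` actually deletes it. On a node that holds real records, this wipes that node's local copy of the data.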
Bill
11:22 AM
start the node as single or multi?
Kishore Nallan
11:22 AM
Just start it as it is with the multi node configuration.
Bill
11:22 AM
ok just a moment
Bill
11:25 AM
I got connection refused, then I started the other 2 nodes and now I get peer refresh succeeded
Kishore Nallan
11:26 AM
Yeah you are good to go then. I'm sure that the data directory had some previous state.
Bill
11:26 AM
Maybe, but this was an install on a new instance
Bill
11:26 AM
without data
Bill
11:27 AM
What exactly does this do -> rm -rf /var/lib/typesense/*
Kishore Nallan
11:27 AM
It deletes the contents of the typesense data directory.
Bill
11:27 AM
Is it safe to use it with records on production?
Kishore Nallan
11:28 AM
You mean whether the cluster is fine to use in production now? Yes, absolutely. I have no idea what happened earlier, but this is a clean state.
Bill
11:30 AM
Yes for example, I have a 3 node cluster now and I want to add 2 more nodes (5 node cluster). If I get this error (Multi-node with no leader: refusing to reset peers) can I use this command?
Bill
11:30 AM
Without losing data
Kishore Nallan
11:31 AM
This will only happen when you mess up the cluster state by replacing all nodes with different IPs at once.
Kishore Nallan
11:31 AM
If you want to be sure and get to the bottom of the original issue, maybe you can try doing the cluster provisioning from scratch to confirm.
Bill
11:32 AM
What do you mean from scratch? Set the nodes.txt with 5 peering IPs even if they don't exist now?
Kishore Nallan
11:33 AM
I mean, you said that you faced this issue on a brand new droplet, right? If you want, you can try creating a cluster from new droplets again. But whatever happened earlier, the cluster is fine now, so it should be fine.
Kishore Nallan
11:33 AM
To add 2 more nodes, you just need to update the nodes file. It will work.
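As an illustration of what the updated nodes file might contain, the sketch below builds the comma-separated line from a list of hosts. The helper name is hypothetical, and the two extra addresses (10.114.0.5 and 10.114.0.6) are made-up placeholders for the new nodes, not addresses from this thread.

```python
def build_nodes_line(hosts, peering_port: int = 8107, api_port: int = 8108) -> str:
    """Join hosts into the ip:peering_port:api_port,... format used by the nodes file."""
    return ",".join(f"{host}:{peering_port}:{api_port}" for host in hosts)


existing = ["10.114.0.2", "10.114.0.3", "10.114.0.4"]
new_hosts = ["10.114.0.5", "10.114.0.6"]  # hypothetical addresses for the 2 added nodes
nodes_line = build_nodes_line(existing + new_hosts)
```

The resulting line would go into the nodes file on every node, old and new, so that all five agree on the cluster membership.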
Bill
11:35 AM
Perfect okay, thank you Kishore!
Kishore Nallan
11:35 AM
Welcome.
