Discussion About Typesense Nodes Not Synchronizing Correctly
TLDR: Erick reported that documents were not being updated reliably on a Typesense deployment running on 3 nodes. After requesting the debug output and configuration of each node, Jason identified that the nodes were not part of the same cluster. The connection failure between the nodes could not be resolved, and a fresh installation was recommended.
Feb 03, 2022 (21 months ago)
Erick
10:20 PMJason
10:20 PMErick
10:21 PMErick
10:21 PMJason
10:21 PMJason
10:22 PMErick
10:22 PMErick
10:22 PMErick
10:22 PMJason
10:23 PMErick
10:25 PMJason
10:27 PMJason
10:27 PMErick
10:29 PMErick
10:30 PMErick
10:30 PMErick
10:31 PMErick
10:32 PMJason
10:33 PMErick
10:44 PM
First time
• Change Firebase, Typesense-Ext executed, Typesense document result (using curl) = old parameter.
Second time
• Same as first time.
Third time
• Wait to change the document, perform the same steps, Typesense document result (using curl) = new parameter.
Jason
10:48 PMErick
10:51 PMJason
11:08 PMJason
11:08 PMErick
11:13 PMJason
11:16 PMJason
11:16 PMErick
11:21 PMJason
11:23 PMJason
11:24 PMErick
11:28 PM
Server 1
Debug output:
{
"state": 1,
"version": "0.22.1"
}
Server 2
Debug output:
{
"state": 1,
"version": "0.22.1"
}
Server 3
Debug output:
{
"state": 1,
"version": "0.22.1"
}
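One way to read the three debug outputs together is to count how many nodes claim each role. The sketch below assumes the state codes described in the Typesense documentation (`state: 1` for a leader, `state: 4` for a follower); the function name `summarize_cluster` and the server labels are hypothetical and only serve the illustration.

```python
from collections import Counter
from typing import Dict

# Assumed mapping of /debug state codes: 1 = leader, 4 = follower.
ROLE_BY_STATE = {1: "leader", 4: "follower"}


def summarize_cluster(states: Dict[str, int]) -> str:
    """Summarize cluster health from each node's reported /debug state."""
    roles = Counter(ROLE_BY_STATE.get(s, "other") for s in states.values())
    leaders = roles["leader"]
    if leaders > 1:
        # Every such node believes it leads its own cluster.
        return f"split: {leaders} nodes each report themselves as leader"
    if leaders == 0:
        return "no leader elected"
    if roles["follower"] == len(states) - 1:
        return "healthy: one leader, rest followers"
    return "one leader, but some nodes are neither leader nor follower"


# States taken from the three debug outputs pasted above.
reported = {"server1": 1, "server2": 1, "server3": 1}
print(summarize_cluster(reported))
```

With all three nodes reporting `state: 1`, the summary indicates that each node is acting as the leader of its own single-node cluster rather than joining the other two.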
Jason
11:30 PM
state: 1 on one node and state: 4 on the other two nodes
Erick
11:30 PMErick
11:31 PMErick
11:31 PMJason
11:31 PMErick
11:31 PMErick
11:48 PM
Server 1
; Typesense Configuration
[server]
api-address = company-ip-1
api-port = 443
data-dir = /var/lib/typesense
log-dir = /var/log/typesense
api-key = typesense-api-key
peering-address = 192.168.199.212
peering-port = 8107
nodes = /etc/typesense/nodes
ssl-certificate = /etc/letsencrypt/live/test1.example.com/fullchain.pem
ssl-certificate-key = /etc/letsencrypt/live/test2.example.com/privkey.pem
Server 2
; Typesense Configuration
[server]
api-address = company-ip-2
api-port = 443
data-dir = /var/lib/typesense
log-dir = /var/log/typesense
api-key = typesense-api-key
peering-address = 192.168.199.3
peering-port = 8107
nodes = /etc/typesense/nodes
ssl-certificate = /etc/letsencrypt/live/test2.example.com/fullchain.pem
ssl-certificate-key = /etc/letsencrypt/live/test2.example.com/privkey.pem
Server 3
; Typesense Configuration
[server]
api-address = company-ip-3
api-port = 443
data-dir = /var/lib/typesense
log-dir = /var/log/typesense
api-key = typesense-api-key
peering-address = 192.168.199.25
peering-port = 8107
nodes = /etc/typesense/nodes
ssl-certificate = /etc/letsencrypt/live/test3.example.com/fullchain.pem
ssl-certificate-key = /etc/letsencrypt/live/test3.example.com/privkey.pem
Erick
11:54 PM
192.168.199.212:8107:443,192.168.199.3:8107:443,192.168.199.25:8107:443
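A nodes string like this one can be sanity-checked from each host by parsing it and attempting a TCP connection to every peering port. The following is a minimal sketch, assuming each entry has the form `host:peering_port:api_port` as in the string above; the names `Peer`, `parse_nodes`, and `can_connect` are hypothetical helpers, not part of Typesense.

```python
import socket
from typing import List, NamedTuple


class Peer(NamedTuple):
    host: str
    peering_port: int
    api_port: int


def parse_nodes(nodes: str) -> List[Peer]:
    """Parse a comma-separated 'host:peering_port:api_port' nodes string."""
    peers = []
    for entry in nodes.strip().split(","):
        host, peering_port, api_port = entry.strip().split(":")
        peers.append(Peer(host, int(peering_port), int(api_port)))
    return peers


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


peers = parse_nodes(
    "192.168.199.212:8107:443,192.168.199.3:8107:443,192.168.199.25:8107:443"
)
print(peers)
```

Running `can_connect(peer.host, peer.peering_port)` for each parsed peer, from each of the three hosts, would show which peering ports are blocked between which pairs of machines.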
Jason
11:57 PMJason
11:58 PM
telnet 192.168.199.3 8107 should show you a prompt when you run it from the other two hosts
Feb 04, 2022 (21 months ago)
Erick
12:06 AMErick
12:07 AMJason
12:11 AMErick
12:32 AMErick
12:32 AMErick
12:33 AMJason
12:33 AMJason
12:33 AMErick
12:33 AMJason
12:34 AM
sudo systemctl restart typesense-server.service
Jason
12:34 AMErick
12:39 AMJason
12:39 AM
Erick
12:40 AMJason
12:41 AMJason
12:41 AMErick
12:43 AMErick
12:46 AMJason
12:57 AMJason
12:57 AMJason
12:58 AMErick
01:02 AMJason
01:02 AM
/var/log/typesense/typesense.log
Erick
01:12 AM
I20211215 05:19:19.205374 771 node.cpp:722] node default_group:192.168.199.3:8107:443 waits peer 192.168.199.212:8107:443 to catch up
I20211215 05:19:19.205453 771 node.cpp:722] node default_group:192.168.199.3:8107:443 waits peer 192.168.199.25:8107:443 to catch up
W20211215 05:19:19.206537 774 replicator.cpp:392] Group default_group fail to issue RPC to 192.168.199.212:8107:443 _consecutive_error_times=11, [E112]Not connected to 192.168.199.212:8107 yet, server_id=206158430322 [R1][E112]Not connected to 192.168.199.212:8107 yet, server_id=206158430322 [R2][E112]Not connected to 192.168.199.212:8107 yet, server_id=206158430322 [R3][E112]Not connected to 192.168.199.212:8107 yet, server_id=206158430322
W20211215 05:19:19.206666 774 replicator.cpp:292] Group default_group fail to issue RPC to 192.168.199.25:8107:443 _consecutive_error_times=11, [E112]Not connected to 192.168.199.25:8107 yet, server_id=163208757684 [R1][E112]Not connected to 192.168.199.25:8107 yet, server_id=163208757684 [R2][E112]Not connected to 192.168.199.25:8107 yet, server_id=163208757684 [R3][E112]Not connected to 192.168.199.25:8107 yet, server_id=163208757684
W20211215 05:19:20.806144 771 socket.cpp:1193] Fail to wait EPOLLOUT of fd=28: Connection timed out [110]
W20211215 05:19:20.806241 771 socket.cpp:1193] Fail to wait EPOLLOUT of fd=26: Connection timed out [110]
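When the log is long, the peers that cannot be reached can be pulled out mechanically. The sketch below is an illustration only: it assumes the `fail to issue RPC to <peer> _consecutive_error_times=<n>` wording seen in the warning lines above, and the function name `unreachable_peers` is hypothetical.

```python
import re
from typing import Dict

# Matches the replicator warnings shown in the log excerpt above.
RPC_FAILURE = re.compile(
    r"fail to issue RPC to (?P<peer>\S+) _consecutive_error_times=(?P<count>\d+)"
)


def unreachable_peers(log_text: str) -> Dict[str, int]:
    """Map each failing peer to its highest consecutive error count."""
    worst: Dict[str, int] = {}
    for match in RPC_FAILURE.finditer(log_text):
        peer = match.group("peer")
        count = int(match.group("count"))
        worst[peer] = max(worst.get(peer, 0), count)
    return worst


sample = (
    "W20211215 05:19:19.206537 774 replicator.cpp:392] Group default_group "
    "fail to issue RPC to 192.168.199.212:8107:443 _consecutive_error_times=11, "
    "[E112]Not connected to 192.168.199.212:8107 yet\n"
    "W20211215 05:19:19.206666 774 replicator.cpp:292] Group default_group "
    "fail to issue RPC to 192.168.199.25:8107:443 _consecutive_error_times=11, "
    "[E112]Not connected to 192.168.199.25:8107 yet\n"
)
print(unreachable_peers(sample))
```

Applied to the excerpt above, both other nodes show up as unreachable from 192.168.199.3, which is consistent with the connection timeouts on the peering port.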
Jason
01:13 AMJason
01:13 AMErick
01:15 AMErick
01:40 AM
Peer refresh failed, error: Peer 192.168.199.212:8107:443 failed to catch up
Erick
01:43 AMErick
01:43 AMErick
01:45 AM
Running GC for aborted requests, req map size: 0
I20211214 17:47:19.990471 673 raft_server.cpp:524] Term: 5, last_index index: 5, committed_index: 5, known_applied_index: 5, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 1
Jason
01:45 AMJason
01:46 AMErick
01:46 AMErick
01:47 AMJason
01:47 AMErick
01:54 AM
Started Typesense Server.
Log directory is configured as: /var/log/typesense
Peer refresh failed, error: Doing another configuration change
Jason
01:55 AMJason
01:55 AMErick
01:57 AMJason
01:59 AMJason
01:59 AMErick
02:03 AMErick
02:08 AMKishore Nallan
02:11 AM
(rm -rf /var/lib/typesense/*) and starting them one by one again?
Erick
02:14 AMErick
02:22 AMKishore Nallan
02:50 AMErick
03:04 AMErick
03:04 AM