Resolving Server Stoppage Issues in Typesense Multi VM Cluster
TLDR gaurav faced issues with the Typesense server in a multi VM cluster, including automatic stoppage and errors. Kishore Nallan identified the lack of a quorum and suggested using three nodes. When the issue persisted, they advised running Typesense via nohup or systemd to prevent session closure from stopping the process.
Nov 17, 2022 (13 months ago)
gaurav
03:55 AM I am using SSM to access the VM. Typesense on Docker works perfectly, but the Typesense service (typesense-server) crashes after a while. Any ideas? Also, I am now trying to set up the cluster again, and one VM is throwing a 503 error.
Kishore Nallan
03:56 AM
gaurav
04:18 AM It's a 2 VM configuration. VM1 returns 503, {'ok': False} on the health check and VM2 returns 200 {'ok': True}.
To start the servers I used the commands below.
VM1:
```
sudo typesense-server \
  --data-dir /home/charlie/cluster/typesense \
  --api-key=xyz \
  --api-address 0.0.0.0 \
  --api-port 8108 \
  --peering-address 10.212.22.59 \
  --peering-port 8107 \
  --log-dir=/home/charlie/cluster/logs \
  --nodes=/home/charlie/cluster/nodes
Log directory is configured as: /home/charlie/cluster/logs
```
VM2:
```
sudo typesense-server \
  --data-dir /home/charlie/cluster/typesense \
  --api-key=xyz \
  --api-address 0.0.0.0 \
  --api-port 8108 \
  --peering-address 10.212.22.189 \
  --peering-port 8107 \
  --log-dir=/home/charlie/cluster/logs
Log directory is configured as: /home/charlie/cluster/logs
E20221117 09:35:39.196200 2809 raft_server.cpp:589] Node not ready yet (known_applied_index is 0).
E20221117 09:35:39.196218 2817 raft_server.h:62] Peer refresh failed, error: Doing another configuration change
```
When I call the debug API I get state 4 for VM1 and state 1 for VM2. Logs are also attached.
The problem is with VM1: I don't know why it is throwing a 503 error, although from the debug output it appears to be working as expected.
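As background for the two-node setup described above, here is a minimal sketch of the majority arithmetic that Raft-style clustering relies on. The function names `quorum` and `tolerated_failures` are hypothetical helpers for illustration, not part of Typesense.

```shell
# Hypothetical helpers (not part of Typesense) illustrating majority-quorum arithmetic.

# Smallest number of nodes that forms a majority of an N-node cluster.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of nodes that can be lost while a majority is still reachable.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 5; do
  echo "nodes=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With two nodes the majority is two, so losing either node leaves the cluster without a quorum. Three nodes is the smallest size that tolerates a single failure, which matches the three-node suggestion in the summary.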
gaurav
06:13 AM
Kishore Nallan
06:17 AM
gaurav
06:17 AM
Kishore Nallan
06:18 AM
gaurav
06:19 AM
gaurav
06:20 AM Still, after 50-60 minutes, the Typesense server stops automatically.
It works perfectly in Docker. Any ideas?
Kishore Nallan
06:21 AM
gaurav
06:21 AM
Kishore Nallan
06:22 AM
gaurav
09:03 AM I am using AWS EC2 instances, accessed via SSM.
From the logs I can see that my Typesense server runs from 13:56 until 14:16, after which it stops automatically. You can see the logs below, but none of them points to the cause.
I20221117 14:14:45.019073 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:14:45.019165 21940 raft_server.h:60] Peer refresh succeeded!
I20221117 14:14:55.020509 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:14:55.020674 21942 raft_server.h:60] Peer refresh succeeded!
I20221117 14:15:05.022079 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:15:05.022176 21951 raft_server.h:60] Peer refresh succeeded!
I20221117 14:15:15.023499 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:15:15.023589 21940 raft_server.h:60] Peer refresh succeeded!
I20221117 14:15:25.025126 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:15:25.025230 21942 raft_server.h:60] Peer refresh succeeded!
I20221117 14:15:33.873915 21929 batched_indexer.cpp:250] Running GC for aborted requests, req map size: 0
I20221117 14:15:35.026558 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:15:35.026647 21951 raft_server.h:60] Peer refresh succeeded!
I20221117 14:15:45.028023 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:15:45.028113 21940 raft_server.h:60] Peer refresh succeeded!
I20221117 14:15:55.029551 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:15:55.029634 21942 raft_server.h:60] Peer refresh succeeded!
I20221117 14:16:05.030953 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:16:05.031044 21951 raft_server.h:60] Peer refresh succeeded!
I20221117 14:16:15.032476 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:16:15.032645 21940 raft_server.h:60] Peer refresh succeeded!
I20221117 14:16:25.034013 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:16:25.034106 21942 raft_server.h:60] Peer refresh succeeded!
I20221117 14:16:34.881245 21929 batched_indexer.cpp:250] Running GC for aborted requests, req map size: 0
I20221117 14:16:35.035606 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:16:35.035696 21951 raft_server.h:60] Peer refresh succeeded!
I20221117 14:16:45.037305 21928 raft_server.cpp:534] Term: 32, last_index index: 161, committed_index: 161, known_applied_index: 161, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 759502
I20221117 14:16:45.037391 21940 raft_server.h:60] Peer refresh succeeded!
Now, when I check /var/log/messages, I can see some entries from the same time window as the Typesense server run:
Nov 17 13:56:14 ds-pro-search-0202 systemd-logind: New session c70 of user root.
Nov 17 14:16:53 ds-pro-search-0202 systemd-logind: Removed session c70.
Do you know anything on this?
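The last Typesense log line above is stamped 14:16:45 and the session is removed at 14:16:53, so the stop may coincide with the login session closing. A small sketch for pulling those session events out of a syslog-style file follows. `session_events` is a hypothetical helper name, and the pattern assumes the `systemd-logind` message wording shown in the two lines above.

```shell
# Hypothetical helper: print systemd-logind session open/close events
# from a syslog-style file so they can be compared with server log timestamps.
session_events() {
  grep -E 'systemd-logind: (New session|Removed session)' "$1"
}

# Example (path taken from the message above; reading it usually requires root):
# session_events /var/log/messages
```

Comparing the "Removed session" timestamps with the last line in the Typesense log is one way to check whether each stop lines up with a session ending.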
Kishore Nallan
09:04 AM
gaurav
09:04 AM
Kishore Nallan
09:05 AM
gaurav
09:05 AM
Kishore Nallan
09:05 AM
gaurav
09:06 AM
Kishore Nallan
09:07 AM
Kishore Nallan
09:08 AM /var/log/messages
Kishore Nallan
09:08 AM typesense service.
gaurav
09:12 AM Will update on this. Thanks for this, though; I will test it out.
gaurav
09:14 AM
gaurav
09:14 AM
Kishore Nallan
09:14 AM
gaurav
09:14 AM
Kishore Nallan
09:14 AM
gaurav
09:21 AM Just for reference: CentOS Linux release 7.9.2009
Kishore Nallan
09:24 AM In any case, the quickest fix for you now is to change your command to run via nohup, this way:
nohup ./typesense-server <arguments> &
This will 100% ensure that the process is not killed when the session closes.
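A slightly fuller sketch of the nohup approach, assuming you also want the server output captured in a file and the PID kept for later. `start_detached` and the log path are hypothetical names for illustration; only `nohup` itself comes from the advice above.

```shell
# Hypothetical wrapper: start a command detached from the terminal via nohup.
# The first argument is an assumed log file path; without a redirect,
# nohup would append output to nohup.out instead.
start_detached() {
  local log_file=$1
  shift
  # Redirect stdin/stdout/stderr so nothing stays attached to the session.
  nohup "$@" >"$log_file" 2>&1 </dev/null &
  # Print the PID of the background process so the caller can track it.
  echo $!
}

# Example, with the real arguments substituted for the placeholder:
# pid=$(start_detached /tmp/typesense.out ./typesense-server <arguments>)
```

Keeping the PID makes it possible to check later with `kill -0 "$pid"` whether the process is still alive after the session has ended. Running the server as a systemd service, the other option named in the summary, avoids tying the process to a login session at all.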
gaurav
05:38 PM systemd