Node Boot Errors in Typesense Cluster
TL;DR: Sergio hit boot errors (a segmentation fault during startup) on the third node of a 3-node Typesense 0.23.1 cluster. Kishore Nallan suggested deleting that node's data directory and restarting it, which resolved the issue.
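The reset described above might be scripted roughly as follows. This is only a sketch under assumptions: the helper name `set_aside_data_dir` and the example path are hypothetical, the Typesense process on that node is assumed to be stopped already, and the directory is renamed rather than deleted so a copy survives. After a restart with an empty data directory, the node is expected to catch up from the healthy peers.

```python
import shutil
import time
from pathlib import Path


def set_aside_data_dir(data_dir: str) -> Path:
    """Move a stopped node's data directory aside and recreate it empty.

    Renaming instead of deleting keeps a copy in case it is needed later.
    Returns the path of the renamed (backup) directory.
    """
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"data directory not found: {src}")
    # Timestamp suffix so repeated resets do not overwrite each other.
    backup = src.with_name(f"{src.name}.bak-{int(time.time())}")
    shutil.move(str(src), str(backup))
    src.mkdir()  # fresh, empty directory for the node to start from
    return backup


# Hypothetical usage; stop the Typesense process on this node first:
# set_aside_data_dir("/var/lib/typesense")
```

Only the broken node's directory is touched; the two healthy nodes are left as they are.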
Feb 27, 2023
Sergio
02:58 PM We have a cluster of 3 nodes. While starting the third node (the other 2 nodes are healthy) we got the following errors.
Any thoughts?
Sergio
02:59 PM On reboot of the Typesense node we got this error:
I20230227 13:38:14.356796 94 store.h:307] DB open success!
I20230227 13:38:14.356813 94 raft_server.cpp:479] Loading collections from disk...
I20230227 13:38:14.356819 94 collection_manager.cpp:132] CollectionManager::load()
I20230227 13:38:14.356945 94 auth_manager.cpp:32] Indexing 9 API key(s) found on disk.
I20230227 13:38:14.357082 94 collection_manager.cpp:152] Loading upto 8 collections in parallel, 1000 documents at a time.
I20230227 13:38:14.357206 94 collection_manager.cpp:159] Found 2 collection(s) on disk.
I20230227 13:38:14.362864 131 collection_manager.cpp:83] Found collection products-8 with 4 memory shards.
I20230227 13:38:14.365340 130 collection_manager.cpp:83] Found collection products-7 with 4 memory shards.
I20230227 13:38:14.368235 130 collection_manager.cpp:1114] Loading collection products-7
I20230227 13:38:14.371126 131 collection_manager.cpp:1114] Loading collection products-8
E20230227 13:38:14.691952 85 backward.hpp:4199] Stack trace (most recent call last) in thread 85:
E20230227 13:38:14.701339 85 backward.hpp:4199] #5 Object "", at 0xffffffffffffffff, in
E20230227 13:38:14.702204 85 backward.hpp:4199] #4 Object "/lib/x86_64-linux-gnu/libc-2.23.so", at 0x7f84dcbb251c, in __clone
E20230227 13:38:14.702880 85 backward.hpp:4199] #3 Object "/lib/x86_64-linux-gnu/libpthread-2.23.so", at 0x7f84dd5916b9, in start_thread
E20230227 13:38:14.703495 85 backward.hpp:4199] #2 Source "../../../../../libstdc++-v3/src/c++11/thread.cc", line 80, in execute_native_thread_routine [0x14ae6cf]
E20230227 13:38:14.704129 85 backward.hpp:4199] #1 Source "/typesense/src/batched_indexer.cpp", line 123, in run [0x50cd91]
E20230227 13:38:14.704944 85 backward.hpp:4199] #0 Source "/typesense/include/store.h", line 169, in scan [0x512925]
Segmentation fault (Address not mapped to object [(nil)])
E20230227 13:38:15.616197 85 typesense_server.cpp:95] Typesense 0.23.1 is terminating abruptly.
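For reading a trace like the one above, a small parser can pull out the frames that carry source locations. This is a rough sketch, not a Typesense tool: the function name `source_frames` and both regular expressions are assumptions based only on the line layout visible in this log (glog-style prefix, then `#N Source "file", line N, in func`). Frame #0 is the innermost call, so it is the place where the crash happened.

```python
import re

# Prefix as seen in the log above: severity letter, yyyymmdd, time,
# thread id, then "file:line]" followed by the message.
GLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{8}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)
# Stack frames that name a source file, e.g.
#   #0 Source "/typesense/include/store.h", line 169, in scan [0x512925]
FRAME = re.compile(
    r'^#(?P<idx>\d+) Source "(?P<file>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)'
)


def source_frames(log_text):
    """Return (frame index, file, line, function) tuples, innermost first."""
    frames = []
    for raw in log_text.splitlines():
        m = GLOG_LINE.match(raw.strip())
        if not m or m.group("sev") not in ("E", "F"):
            continue  # keep only error/fatal lines
        f = FRAME.match(m.group("msg"))
        if f:
            frames.append(
                (int(f.group("idx")), f.group("file"),
                 int(f.group("line")), f.group("func"))
            )
    return sorted(frames)
```

Applied to the log above, the first tuple points at `scan` in `/typesense/include/store.h`, line 169, called from `run` in `/typesense/src/batched_indexer.cpp`, line 123.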
Sergio
03:00 PM
I20230227 14:21:17.644379 1 http_server.cpp:177] Typesense has started listening on port 8108
I20230227 14:21:17.644613 102 batched_indexer.cpp:120] Starting batch indexer with 16 threads.
I20230227 14:21:17.645495 102 batched_indexer.cpp:126] BatchedIndexer skip_index: -9999
And the node never becomes "healthy".
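Whether a node has become healthy can be checked by polling its `/health` endpoint, which answers with a JSON body containing an `ok` flag. The following is a minimal sketch: the helper names, retry counts, and host are assumptions, port 8108 is taken from the log above, and the endpoint is assumed to be reachable from where the script runs.

```python
import json
import time
import urllib.request


def fetch_health(base_url, timeout=2.0):
    """Return the raw body of GET {base_url}/health."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def is_healthy(body):
    """True only if the body is a JSON object with "ok": true."""
    try:
        return json.loads(body).get("ok") is True
    except (ValueError, AttributeError):
        return False


def wait_until_healthy(fetch, attempts=30, delay=2.0):
    """Call fetch() up to `attempts` times until it reports a healthy node."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetch()):
                return True
        except OSError:
            # Connection refused, timeout, or an HTTP error status:
            # treat all of these as "not healthy yet".
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Hypothetical usage against a local node:
# wait_until_healthy(lambda: fetch_health("http://localhost:8108"))
```

Passing the fetch function in as an argument keeps the retry loop independent of the network, so it can be exercised with a stub.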
Kishore Nallan
04:07 PM
Kishore Nallan
04:07 PM
Sergio
10:00 PM