#community-help

Troubleshooting Unhealthy Typesense Server

TLDR Akash reported an unhealthy Typesense server. Kishore Nallan traced it to the cluster hitting its CPU limits, upgraded it to 4 dedicated CPUs, and recommended upgrading to v0.23 and moving to a 3-node, 2 CPU burst (HA) configuration. They also discussed the client timeout.

Solved
Dec 31, 2022
Akash
01:09 PM
Jason, the Typesense server shows as unhealthy, and the schema is not showing or working.
Kishore Nallan
01:14 PM
This time the cluster is running into CPU limits. Given your search volume, you probably need a larger cluster and should also use an HA configuration. You can look at the metrics on Typesense Cloud to do basic debugging. If your memory or CPU thresholds are breached, you have to launch a new cluster or schedule an upgrade pre-emptively. While we react swiftly to general system-wide cluster stability issues, operational issues caused by resource constraints are out of our hands.
Akash
01:22 PM
Please double the CPU limit. I thought the search load is about 15 requests per second.
Akash
01:25 PM
I suggest you increase the CPU limit 2-3 times.
Kishore Nallan
01:25 PM
A heavy query can sometimes lock up the CPU. I've restarted Typesense to give the cluster some breathing space. I also recommend testing your app against v0.23 and upgrading to it.
Kishore Nallan
01:26 PM
The cluster is back up. Another upscale will require more downtime, since you don't use an HA configuration. My recommendation is to launch a new cluster of your desired size, reindex your data, and switch your application to the new search endpoint.
Akash
01:34 PM
Please double the CPU size.
Kishore Nallan
01:39 PM
Started the upgrade to a 4 dedicated CPU configuration. ETA 10-15 mins.
Akash
01:40 PM
ok
Akash
01:57 PM
We are using React InstantSearch, and it calls the API multiple times when an API call times out.
Kishore Nallan
01:58 PM
Try increasing your client timeout, because otherwise this can cause a thundering herd of retries.
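To illustrate why a short client timeout can turn into a retry storm, here is a rough, simplified model (not something posted in the thread). It assumes every attempt times out and each search is retried a fixed number of times; the function name and the retry count of 3 are illustrative assumptions, while 15 requests per second is the figure Akash mentioned.

```python
def worst_case_request_rate(user_rps: float, num_retries: int) -> float:
    """Upper bound on requests/sec reaching the server when every attempt times out.

    Simplified model: each user search is sent once, then retried
    `num_retries` times, so the server sees (1 + num_retries) attempts per search.
    """
    return user_rps * (1 + num_retries)


# 15 searches/sec comes from this thread; 3 retries is an assumed value.
print(worst_case_request_rate(15, 3))  # 60
```

Under these assumptions, an overloaded server receives up to four times the original traffic, which makes recovery harder. A longer timeout reduces how often the retries are triggered in the first place.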
Kishore Nallan
02:00 PM
The scaled-up node is back up. You should migrate to an HA configuration for production workloads.
Akash
02:01 PM
'connection_timeout_seconds': 10
Akash
02:03 PM
How do I increase the client timeout?
Kishore Nallan
02:04 PM
Right now you are on an 8 GB / 4 dedicated CPU configuration. Moving to a 3-node, 2 CPU burst configuration will be better for you.
Kishore Nallan
02:04 PM
A 10 sec timeout is enough. The CPU probably wasn't enough to serve your current load. Hopefully 4 dedicated CPUs will hold up better.
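For readers looking for where the timeout is set, the following is a minimal sketch of a Python client configuration, matching the `'connection_timeout_seconds': 10` key quoted above. The host, API key, and the retry values are placeholders and assumptions, not details from this thread. JavaScript clients (including the InstantSearch adapter) use the camelCase equivalent, `connectionTimeoutSeconds`.

```python
# Hypothetical connection details; replace with your own cluster's values.
config = {
    "api_key": "YOUR_SEARCH_ONLY_API_KEY",  # placeholder
    "nodes": [
        {"host": "your-cluster.example.net", "port": "443", "protocol": "https"}  # placeholder host
    ],
    "connection_timeout_seconds": 10,  # the value quoted in this thread
    "num_retries": 3,                  # assumed value, tune for your workload
    "retry_interval_seconds": 1,       # assumed value, tune for your workload
}

# With the `typesense` package installed, the dict would be passed as:
#   import typesense
#   client = typesense.Client(config)
print(config["connection_timeout_seconds"])
```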
Kishore Nallan
02:05 PM
You should also upgrade to v0.23