# community-help
v
Hi everyone, need some urgent help. Two of the three nodes in our cluster are down: one is out of memory and one is unhealthy. It has been more than 20 minutes and they haven't come back up yet. The cluster ID is jot73bcydg41v5w8p. Any help is much appreciated, as this is a production cluster. Sorry for the tag, but @Kishore Nallan / @Jason Bosco, any help figuring out why this happened would be appreciated as well. We had a spike in requests at that time, but not a very large one.
k
Checking
v
Hi @Kishore Nallan, can I upgrade our cluster? Will it interfere with your debugging?
k
2/3 nodes are okay but your client is not distributing the requests across both of them. Only 1 node being hit.
v
So we hit the load-balanced endpoint directly. Our client is very dumb.
k
Third node is also up.
We discussed this before. If you are making requests from your server and reusing the same client, the DNS result could be getting cached heavily, so only 1 node will be hit.
Our own TTL is very low: if you check with dig, it is 10 seconds.
v
I thought DNS caching was a Typesense-side issue, didn't realise it was a client-side issue. What do you propose here then? Should I do round robin on our end?
k
For your server-side use case, remove the load-balanced DNS and use the explicit per-node DNS names. The other option is to create a pool of clients and rotate them randomly every X requests, so that the DNS resolutions are re-run and there is a greater chance of spread.
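A minimal sketch of the second option (a pool of clients, with one replaced at random every X requests). The class name `RotatingClientPool` and the `make_client` factory are hypothetical, and the sketch assumes that building a new client opens a new connection and therefore triggers a new DNS lookup. Whether that holds depends on your HTTP library and on any resolver cache in the OS or runtime.

```python
import random
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class RotatingClientPool(Generic[T]):
    """Keeps `size` clients and rebuilds a random one every `rotate_every` requests."""

    def __init__(self, make_client: Callable[[], T], size: int = 3,
                 rotate_every: int = 100) -> None:
        if size < 1 or rotate_every < 1:
            raise ValueError("size and rotate_every must be >= 1")
        self._make_client = make_client  # hypothetical factory supplied by the caller
        self._clients: List[T] = [make_client() for _ in range(size)]
        self._rotate_every = rotate_every
        self._served = 0

    def get(self) -> T:
        """Return a client for the next request, rotating the pool when due."""
        self._served += 1
        if self._served % self._rotate_every == 0:
            # Replace one client at random. Assumption: the new client
            # reconnects and so re-resolves the load-balanced hostname.
            index = random.randrange(len(self._clients))
            self._clients[index] = self._make_client()
        # Pick randomly so requests are spread over the whole pool.
        return random.choice(self._clients)
```

Call `pool.get()` before each request instead of holding on to one long-lived client. If the client type holds open connections, close the replaced client as well. For the first option, the same factory could simply be given the explicit per-node hostnames.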
v
Got it, thank you for the help again! Just wanted to confirm, this isn't due to the faceting issue like last time?
k
There was a huge spike in search requests and this caused search-time working memory to shoot up.
But let me still check what happened here.
n
Hi @Kishore Nallan, sorry, just want to understand this a bit more. Our server is hitting the Typesense LB. Internally, the LB would be using IP addresses to route requests to one of the nodes. Where does DNS caching come into the picture here?
k
We use DNS-based load balancing. The load-balanced endpoint returns several IPs. We use DNS because we support SDN, which requires routing requests to the node nearest to the user, etc. A standard LB setup won't work and also incurs additional latency due to an extra hop.
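One way to see this from the server is to list the addresses the OS resolver currently returns for the endpoint. This is a rough sketch using only the Python standard library. The function name `resolve_ips` is hypothetical, and the result reflects whatever the local resolver has cached, so repeated calls may keep returning the same answer until the cached entry expires.

```python
import socket
from typing import List


def resolve_ips(host: str, port: int = 443) -> List[str]:
    """Return the distinct IP addresses the OS resolver gives for `host`."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


# Usage (replace the placeholder with your own load-balanced hostname):
#   print(resolve_ips("your-cluster-endpoint.example"))
```

If the list contains several IPs but traffic still lands on one node, the client is most likely reusing a connection or a cached resolution rather than picking a fresh address for each request.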
n
Thanks