# community-help
t
Hi, I was wondering if there is a channel for getting an IP unblocked in Typesense Cloud? I’ve been doing a lot of local work against my cluster as we’ve gone through a pretty big data migration, and it seems my IP has been fully blocked from the cluster, with no way to unblock it short of rolling a new cluster. And it’s not like I can just wait 5 minutes, an hour, or 24 hours; my card just gets pulled and I can’t maintain it anymore.
j
Could you show me a screenshot of the exact error you’re seeing?
t
```
Request #1745500225908: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1745500225908: Sleeping for 0.1s and then retrying request...
Request #1745500226030: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1745500226030: Sleeping for 0.1s and then retrying request...
Request #1745500226037: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1745500226037: Sleeping for 0.1s and then retrying request...
Request #1745500226204: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1745500226204: Sleeping for 0.1s and then retrying request...
Request #1745500226333: Request to Node 0 failed due to "ECONNRESET socket hang up"
Request #1745500226333: Sleeping for 0.1s and then retrying request...
^C
```
I first hit this on some test clusters when I was doing mass deletions. I didn’t realize I could delete by query, so I was firing ~100 requests at once, each deleting a single record. I’ve since switched to delete by query.
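For anyone else who hits this, the switch looks roughly like the sketch below with the typesense-js client; the hostname, collection name, and filter are just placeholders:

```ts
import Typesense from 'typesense';

// Hostname, API key, collection name, and filter below are placeholders.
const client = new Typesense.Client({
  nodes: [{ host: 'xxx.a1.typesense.net', port: 443, protocol: 'https' }],
  apiKey: process.env.TYPESENSE_ADMIN_API_KEY!,
});

async function cleanupOldRecords() {
  // One delete-by-query call removes every matching document,
  // instead of firing one DELETE request per record.
  await client
    .collections('products')
    .documents()
    .delete({ filter_by: 'migrated_at:<1745500000' });
}
```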
This morning, I was running a few different backfills at the same time and hit it for the first time in prod. It’s fine in the staging environments, where I can just clone the cluster and try to be more careful next time, but in prod I can’t afford to be locked out for long.
j
We don’t block IPs or rate-limit your cluster. This sounds more like your cluster has run out of capacity. Could you check your CPU usage?
t
I would think so, but this is still happening 24 hours later, and search is running just fine.
j
Are you using a burst CPU type?
t
Yes
j
Ah, ok, then you most likely ran out of burst CPU capacity.
❤️ 1
You want to try upgrading to a non-burst CPU type, which comes with a minimum of 4GB RAM.
t
Well, maybe we were. We switched to HA, and I don’t see it as an option anymore.
We were previously 4GB with 8GB burst, but now that we’re HA, I don’t see burst as an option at all for the cluster anymore.
Correction: we’re a 64GB HA cluster with 4 vCPUs. Sorry, I’m bad at detailing specs correctly.
Our CPU consumption is pretty high, but I don’t think we’ve burned through it or anything.
So yeah, we’re a 64GB HA cluster with 4 vCPUs. I don’t think we’re a burst type because I don’t think that’s an option at that scale. I can provide the cluster ID. Search works, but any time I try to connect locally, I get socket hang-ups.
@Jason Bosco - I was able to make some of these changes by screensharing with a colleague while having them run my scripts. I think some sort of logical barrier thing is treating me as a bot swarm.
Oh, I’m good now too! Cool! Okay, maybe it was something with the CPUs after all. Thank you, my mistake!
j
If you're sending requests directly to the hostnames you see on your dashboard (and not proxying them via Cloudflare, for example), then we don't add any other IP-based rate limits on the Typesense side. What I do notice in your cluster is that the number of TCP connections progressively increases as the writes come in, so I suspect something is triggering high-concurrency writes. You want to limit the number of batch import API calls you make to a max of N-1, where N is the number of cores on your cluster, and then increase the number of documents you send within each API call. If you're already doing this, then another source of the issue could be a short connection timeout configured in the client that does the import: connections time out before the import is fully complete, so the client ends up terminating an in-progress import and retrying it on a new connection, which results in a thundering herd issue.
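For what it's worth, a rough sketch of that pattern with the typesense-js client might look like the following; the hostname, API key, collection name, and batch size are placeholders, and the concurrency cap of 3 assumes a 4-vCPU cluster:

```ts
import Typesense from 'typesense';

// Placeholders: hostname, API key, collection name, batch size.
const client = new Typesense.Client({
  nodes: [{ host: 'xxx.a1.typesense.net', port: 443, protocol: 'https' }],
  apiKey: process.env.TYPESENSE_ADMIN_API_KEY!,
  // Give large imports enough time to finish, so the client doesn't hang up
  // mid-import and retry on a fresh connection (the thundering herd problem).
  connectionTimeoutSeconds: 600,
});

const BATCH_SIZE = 5000;   // documents per import call
const MAX_IN_FLIGHT = 3;   // N-1 for a 4-vCPU cluster

async function importAll(docs: Record<string, unknown>[]) {
  // Split the documents into large batches.
  const batches: Record<string, unknown>[][] = [];
  for (let i = 0; i < docs.length; i += BATCH_SIZE) {
    batches.push(docs.slice(i, i + BATCH_SIZE));
  }

  // Keep at most MAX_IN_FLIGHT import calls running at the same time.
  for (let i = 0; i < batches.length; i += MAX_IN_FLIGHT) {
    await Promise.all(
      batches.slice(i, i + MAX_IN_FLIGHT).map((batch) =>
        client.collections('products').documents().import(batch, { action: 'upsert' })
      )
    );
  }
}
```

The key points are the long connectionTimeoutSeconds and the cap on concurrent import calls; the exact batch size is something you'd tune for your document size.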
t
@Jason Bosco - Thank you for the excellent help here, as always. I haven’t encountered this since, and I think sticking to responsible API usage patterns has worked well enough.
j
Awesome! 🙌
🙌 1