#community-help

Bulk Indexing Issue with 3 Million Posts

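TLDR Digamber hits cURL error 28 while bulk indexing 3 million posts. Kishore Nallan suggests checking the request payload size and notes that Typesense can return a 503 status code as backpressure when writes arrive too fast, in which case the client should back off and retry.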

Mar 09, 2023 (7 months ago)
Digamber
11:49 AM
Hi guys,
I have 3 million posts that I'm bulk indexing, and during the indexing process I run into cURL error 28.
Kishore Nallan
11:50 AM
The post body might be too large
Digamber
11:51 AM
The thing is, it bails at a different point each time.
Kishore Nallan
11:52 AM
What's the size of the file being posted?
Digamber
11:52 AM
Additionally, the current code makes a request to /health to check the server's condition
Digamber
11:52 AM
and that’s when it bails.
Kishore Nallan
11:53 AM
Are you using the PHP client?
Digamber
11:53 AM
No, we’re making cURL requests.
Kishore Nallan
11:53 AM
So the thing is, if you write too fast, Typesense can return a 503 status code as a backpressure mechanism to let writes catch up.
Kishore Nallan
11:54 AM
If you get a 503, back off and retry after a short interval.
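The back-off-and-retry advice above can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the code from this thread: the function names `send_batch` and `import_with_backoff`, the retry count, and the delays are all assumptions chosen for the example, and the URL and API key are placeholders you would supply. Note also that cURL error 28 is libcurl's operation-timeout code, so a generous request timeout is worth setting alongside the retry logic.

```python
import time
import urllib.error
import urllib.request


def send_batch(url, api_key, jsonl_body, timeout=60):
    """POST one JSONL batch and return the HTTP status code.

    `url` is assumed to point at the collection's documents/import endpoint.
    """
    req = urllib.request.Request(
        url,
        data=jsonl_body.encode("utf-8"),
        headers={"X-TYPESENSE-API-KEY": api_key, "Content-Type": "text/plain"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; surface the code instead.
        return err.code


def import_with_backoff(send, max_retries=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns something other than 503.

    Waits base_delay * 2**attempt seconds between attempts (illustrative
    values) and returns the last status code seen.
    """
    status = None
    for attempt in range(max_retries + 1):
        status = send()
        if status != 503:
            return status
        if attempt < max_retries:
            sleep(base_delay * (2 ** attempt))
    return status


# Hypothetical usage, with url, key, and body defined by the caller:
# status = import_with_backoff(lambda: send_batch(url, key, body))
```

Passing the sender in as a callable keeps the retry loop independent of the HTTP library, so the same loop could wrap a cURL-based sender.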
Digamber
11:57 AM
OK, let me see if I can reduce the payload and test it again.
Also, I’m not sure whether I’m getting a 503 error. I’ll verify and check.
Digamber
12:06 PM
Also, can you let me know how to measure the payload size here?
Kishore Nallan
12:07 PM
It's the size of the raw JSONL body you are posting. Max payload size is 10 GB, so I don't think you are running into that.
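One way to measure that size is to build the JSONL body and count its bytes after UTF-8 encoding, since byte count and character count differ for non-ASCII text. The sketch below is in Python and the helper names `to_jsonl` and `payload_size_bytes` are made up for illustration; the same idea applies in any language that can report the byte length of the request body.

```python
import json


def to_jsonl(docs):
    """Serialize a list of dicts as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(doc, ensure_ascii=False) for doc in docs)


def payload_size_bytes(jsonl_body):
    """Size of the request body in bytes once encoded as UTF-8."""
    return len(jsonl_body.encode("utf-8"))


# Hypothetical sample documents, only to show the call pattern.
batch = [{"id": "1", "title": "first post"}, {"id": "2", "title": "second post"}]
body = to_jsonl(batch)
print(f"{len(batch)} docs, {payload_size_bytes(body)} bytes")
```

Logging this figure per batch makes it easy to see whether failures correlate with larger requests.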
Digamber
12:07 PM
Yeah, OK, that’s probably not it. No way I’m going over 10 GB.
Digamber
12:07 PM
Will do some more tests.