Resolving Timeout Issues in Bulk Document Imports
TLDR: Aljosa encountered timeouts when importing 32k documents in a single call. Kishore Nallan advised increasing the client-side timeout. Bruno suggested importing in chunks and retrying individual chunks. Kishore Nallan mentioned that the next release of Typesense will make imports more reliable.
Sep 22, 2021 (28 months ago)
Aljosa
03:59 PM I've implemented a rebuild process for all documents when I update a collection schema.
So in my backend, I call the updateSchema endpoint which drops the collection, creates a new one with the schema (I should use aliases but just found out about them) and then retrieves all items from the DB to reindex them with the new schema.
Currently the DB has 32k items but will have millions in production. When I send all 32k using documents().import(), it fails with timeouts. Sending batches of approx. 10k to the same call works.
I'm not sure why it makes a difference since the import call already uses batching.
Is there an upper limit to how many documents you can send with import()?
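The client-side batching described in this message could look roughly like the sketch below. It is only an illustration: `chunk` is a hypothetical helper, the batch size of 10,000 just mirrors the figure mentioned above, and the commented-out import call assumes an already configured typesense-js `client` and a made-up collection name `items`.

```typescript
// Hypothetical helper: split an array into consecutive slices of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Example: 32,000 placeholder documents split into batches of up to 10,000.
const docs = Array.from({ length: 32000 }, (_, i) => ({ id: String(i) }));
const batches = chunk(docs, 10000);

// Four batches: three of 10,000 documents and a final one of 2,000.
console.log(batches.map((b) => b.length));

// Each batch would then go to its own import call, for example
// (assumed client and collection name, not taken from the thread):
// for (const batch of batches) {
//   await client.collections("items").documents().import(batch);
// }
```

Each request then carries a bounded payload, so a single slow request is less likely to exceed the client timeout.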
Kishore Nallan
04:02 PM
Kishore Nallan
04:02 PM
Bruno
04:35 PM
Aljosa
06:27 PM Bruno, understood. I think I'm going to keep it chunked (my data has clear sections anyway, anywhere from 2k to 20k rows each) and add a retry for individual chunks if I ever run into a timeout.
Thank you both
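A rough sketch of the "chunk plus retry" approach settled on here, under stated assumptions: `withRetry` and `importInChunks` are hypothetical helpers, the attempt count and delays are arbitrary, and `importChunk` stands in for whatever call actually sends one chunk to Typesense. This version retries on any error, not only timeouts.

```typescript
// Hypothetical helper: run `task`, retrying up to `maxAttempts` times with a growing delay.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Wait longer after each failure: baseDelayMs, then 2x, then 4x, ...
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}

// Import chunks one at a time so a failure only costs a retry of that chunk.
// `importChunk` is a stand-in for the real call, for example
// (assumed client and collection name, not taken from the thread):
//   (chunk) => client.collections("items").documents().import(chunk)
async function importInChunks<D>(
  chunks: D[][],
  importChunk: (chunk: D[]) => Promise<unknown>,
): Promise<void> {
  for (const chunk of chunks) {
    await withRetry(() => importChunk(chunk));
  }
}
```

One thing to watch: if the server had already indexed part of a chunk before the client timed out, retrying with a plain create may report documents whose ids already exist, so an upsert-style import is likely the safer choice for retried chunks.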
Sep 23, 2021 (28 months ago)
Kishore Nallan
12:29 AM