# community-help
a
Hey there 🙋‍♂️ I've implemented a rebuild process for all documents when I update a collection schema. In my backend, I call the updateSchema endpoint, which drops the collection, creates a new one with the new schema (I should probably use aliases, but I just found out about them), and then retrieves all items from the DB to reindex them with the new schema. Currently the DB has 32k items, but it will have millions in production. When I send all 32k using documents().import(), it fails with timeouts. Sending batches of approx. 10k to the same call works. I'm not sure why that makes a difference, since the import call already uses batching. Is there an upper limit to how many documents you can send with import()?
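Roughly what my rebuild looks like, as a sketch assuming the typesense-js client; the node config, API key, and the fetchAllItems() helper are placeholders:

```ts
import Typesense from 'typesense';
import type { CollectionCreateSchema } from 'typesense/lib/Typesense/Collections';

const client = new Typesense.Client({
  nodes: [{ host: 'localhost', port: 8108, protocol: 'http' }],
  apiKey: 'xyz',
});

// fetchAllItems() is a placeholder for the DB query that loads every row.
async function rebuildCollection(
  schema: CollectionCreateSchema,
  fetchAllItems: () => Promise<object[]>,
): Promise<void> {
  // Drop the old collection and recreate it with the new schema.
  await client.collections(schema.name).delete();
  await client.collections().create(schema);

  // Re-send everything in a single import() call -- this is the step
  // that times out at 32k documents.
  const items = await fetchAllItems();
  await client.collections(schema.name).documents().import(items, { action: 'create' });
}
```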
k
👋 I think you are running into a client-side timeout during import. For example, if you use curl, it imports the whole set without issues.
You can increase the timeout in the client configuration.
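For example, with the JS client the relevant setting is connectionTimeoutSeconds; a sketch, where the node config and the 300-second value are just illustrative:

```ts
import Typesense from 'typesense';

const client = new Typesense.Client({
  nodes: [{ host: 'localhost', port: 8108, protocol: 'http' }],
  apiKey: 'xyz',
  // Defaults to a few seconds, which a 32k-document import can exceed;
  // 300 here is an arbitrary, generous value for bulk imports.
  connectionTimeoutSeconds: 300,
});
```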
b
My two cents: it's probably good practice to send the items in chunks anyway; otherwise you lose the whole import if there's any hiccup whatsoever, and you can implement a retry with upsert.
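A sketch of what that could look like with the typesense-js client; CHUNK_SIZE, MAX_RETRIES, and the client config are illustrative placeholders:

```ts
import Typesense from 'typesense';

const client = new Typesense.Client({
  nodes: [{ host: 'localhost', port: 8108, protocol: 'http' }],
  apiKey: 'xyz',
  connectionTimeoutSeconds: 60,
});

const CHUNK_SIZE = 10_000; // arbitrary; tune to your payload size
const MAX_RETRIES = 3;

async function importInChunks(collection: string, items: object[]): Promise<void> {
  for (let i = 0; i < items.length; i += CHUNK_SIZE) {
    const chunk = items.slice(i, i + CHUNK_SIZE);
    for (let attempt = 1; ; attempt++) {
      try {
        // 'upsert' makes a retried chunk idempotent: documents that already
        // landed during a failed attempt are simply overwritten.
        await client.collections(collection).documents().import(chunk, { action: 'upsert' });
        break; // chunk succeeded, move on to the next one
      } catch (err) {
        if (attempt >= MAX_RETRIES) throw err; // give up on this chunk
      }
    }
  }
}
```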
a
@Kishore Nallan makes sense! I have the default 2 seconds from the docs, I'll try to bump that up. Thank you for your advice. @Bruno Ferreira understood, I think I'm gonna keep it chunked (I have clear sections in my data anyway that are anywhere from 2k to 20k rows) and add a retry for individual chunks if I ever run into a timeout. Thank you both!
👍 1
k
In the next release of Typesense (which you can already preview via an RC build), we have made imports atomic. So if a client times out or disconnects, partial updates don't creep in. This should make imports more reliable.