Resolving Typesense Documents Import Error
TLDR Aljosa experienced an error while using typesense documents().import(), related to handling of large document arrays. Jason clarified that batch_size controls server-side batching, not client-side. He advised splitting arrays to address the issue and committed to clarifying what batch_size does in the docs. Aljosa proposed amending the TypeScript types to accommodate batch_size in the import options.
Nov 08, 2021 (26 months ago)
With typesense.js 0.14.0 I was using batch_size as an option as described in the doc, but in 1.0.0, with typings, batch_size is not an accepted option. Anyways, reducing it to 1 or using the default 40 didn't matter. The only way I was able to resolve it was by literally splitting the array in half and doing two imports one after the other.
The solution for such large imports would be to convert to JSONL, and then send that JSONL string into the import method and you won't run into this issue.
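A minimal sketch of the JSONL conversion suggested above, assuming each document is a plain JSON-serializable object. The `Doc` type, the `toJsonl` helper and the sample documents are hypothetical names used for illustration only; they are not part of typesense-js.

```typescript
// Hypothetical document shape, used only for illustration.
type Doc = Record<string, unknown>;

// Serialize each document onto its own line (JSON Lines format).
function toJsonl(docs: ReadonlyArray<Doc>): string {
  return docs.map((doc) => JSON.stringify(doc)).join("\n");
}

const sampleDocs: Doc[] = [
  { id: "1", title: "first" },
  { id: "2", title: "second" },
];

// One JSON object per line, with no enclosing array brackets.
const sampleJsonl = toJsonl(sampleDocs);
```

Per the suggestion above, the resulting string would then be passed to the import method in place of the array.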
The batch_size parameter you mention is actually a Typesense server parameter which does something different: server-side batching. After every X documents imported, the server pauses, services the search request queue, and then gets back to importing.
I see that actually in typesense js it's converted to JSONL anyways https://github.com/typesense/typesense-js/blob/a21d4101bc21fe59e0e85b41e64ba14d6fe88667/src/Typesense/Documents.ts#L112
batch_size parameter, I believe this is a documentation bug then?
Also notice how that method calls JSON.stringify on the entire array object. That's what causes the issue. One thing we could do is to split large arrays into smaller ones, then call JSON.stringify on them individually and then concatenate the results. That way users of the client library don't have to do this themselves...
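One way to read the chunking idea above, as a rough sketch rather than what the client library actually does: slice the array into fixed-size chunks, serialize each chunk separately, and join the pieces. The helper name `toJsonlInChunks` and the default chunk size of 1000 are assumptions made for this example.

```typescript
// Hypothetical helper: serialize a large document array chunk by chunk,
// so that no single serialization step has to cover the whole array.
function toJsonlInChunks(
  docs: ReadonlyArray<Record<string, unknown>>,
  chunkSize: number = 1000, // assumed default, not taken from the thread
): string {
  if (!(chunkSize >= 1) || Math.floor(chunkSize) !== chunkSize) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const parts: string[] = [];
  for (let start = 0; start < docs.length; start += chunkSize) {
    const chunk = docs.slice(start, start + chunkSize);
    // Each chunk becomes its own JSONL fragment.
    parts.push(chunk.map((doc) => JSON.stringify(doc)).join("\n"));
  }
  // Concatenate the fragments into a single JSONL payload.
  return parts.join("\n");
}
```

The output is the same string a single pass over the array would produce; only the intermediate strings are smaller.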
The parameter still works from Typesense Server's perspective. We need to add it to the TypeScript types and clarify in the docs what exactly it means. It doesn't control client-side batching, only server-side batching.
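To illustrate that batch_size is a server parameter, here is a sketch of how it could end up in the query string of the import request. The endpoint shape follows Typesense's HTTP import API as commonly documented; the `ImportQuery` interface and `buildImportPath` helper are hypothetical and not part of typesense-js.

```typescript
// Hypothetical option names mirroring the thread; not the library's actual typings.
interface ImportQuery {
  action?: "create" | "update" | "upsert";
  batch_size?: number; // server-side batching only
}

// Build the request path for a bulk import, appending only the options that are set.
function buildImportPath(collection: string, query: ImportQuery = {}): string {
  const params: string[] = [];
  if (query.action !== undefined) {
    params.push(`action=${encodeURIComponent(query.action)}`);
  }
  if (query.batch_size !== undefined) {
    params.push(`batch_size=${query.batch_size}`);
  }
  const base = `/collections/${encodeURIComponent(collection)}/documents/import`;
  return params.length > 0 ? `${base}?${params.join("&")}` : base;
}

const importPath = buildImportPath("companies", { action: "create", batch_size: 100 });
// importPath is "/collections/companies/documents/import?action=create&batch_size=100"
```

Nothing in this path changes how many documents the client sends in one request, which is why lowering batch_size did not help with the large array.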
I guess I must be close to the limit with the additional manipulations I do on the raw json I post to my server, since I'm able to split the json using the same .map() used in typesense js
> The parameter still works from Typesense Server's perspective. We need to add it to the TypeScript types and clarify in the docs what exactly it means. It doesn't control client-side batching, only server-side batching.
Understood 🙂, I know what you mean now with regard to my initial question being about batching before sending.
And I appreciate the issue having been created! Will batch_size actually be sent correctly then to the server if I add it to the options of the
Yup, it should be sent. Do you want to create a PR adding this to the types?
DocumentWriteParameters and add batch_size to that
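A sketch of what the proposed typing change might look like. The interface below is a stand-in named `DocumentWriteParametersSketch`; the fields of the real DocumentWriteParameters in typesense-js may differ, and the `action` field shown here is an assumption.

```typescript
// Sketch only: a stand-in for the library's DocumentWriteParameters type.
interface DocumentWriteParametersSketch {
  // Assumed existing field; the real interface may declare others.
  action?: "create" | "update" | "upsert";
  // Proposed addition: forwarded to the server, which uses it for server-side batching.
  batch_size?: number;
}

// With the optional field declared, this object literal type-checks.
const importOptions: DocumentWriteParametersSketch = {
  action: "upsert",
  batch_size: 100,
};
```

Declaring the field as optional keeps existing callers that omit batch_size compiling unchanged.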