Resolving JSONL File Import Issues in Python
TLDR: Jon struggled to import a large JSONL file using Python, running into decode errors and size restrictions. Kishore Nallan advised using curl for imports under 10 GB and pointed to an updated Python client that handles large imports better.
Jan 08, 2023 (9 months ago)
Jan 09, 2023 (9 months ago)
Kishore Nallan 04:32 AM
This is available in the 0.15.0 version of the Python client that I've just published.
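The thread does not show what changed in 0.15.0, so the following is only a general sketch of how a large JSONL file can be read in batches without loading it all into memory. The function name `iter_jsonl_batches` and the default batch size are hypothetical. Opening the file explicitly as UTF-8 is an assumption about one common cause of decode errors; the actual cause in this thread is not stated.

```python
import json
from itertools import islice


def iter_jsonl_batches(path, batch_size=1000):
    """Yield lists of parsed documents from a JSONL file, one batch at a time.

    Hypothetical helper: the file is read lazily, so only one batch of
    documents is held in memory at any point.
    """
    # Explicit UTF-8 avoids depending on the platform's default encoding.
    with open(path, "r", encoding="utf-8") as f:
        # Skip blank lines; every other line must be one JSON document.
        docs = (json.loads(line) for line in f if line.strip())
        while True:
            batch = list(islice(docs, batch_size))
            if not batch:
                return
            yield batch
```

Each batch could then be passed to whatever bulk import call your client version provides, for example the Python client's `import_` method if it accepts a list of documents.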
Jan 16, 2023 (9 months ago)
Jan 17, 2023 (9 months ago)
Kishore Nallan 03:34 PM
curl will work fine as long as the POST data is less than 10 GB. So if your total dataset size is 28 GB, you will need to split it into 3 files.
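The split follows from the limit: 28 GB divided by just under 10 GB per request rounds up to 3 files. A minimal sketch of such a split, assuming the file holds one JSON document per line; the function name `split_jsonl` and the part naming scheme are hypothetical, not part of any Typesense tooling.

```python
def split_jsonl(src_path, max_bytes, out_prefix):
    """Split a JSONL file into parts of at most max_bytes each.

    Lines are never broken across parts. A single line longer than
    max_bytes is written to its own part, which then exceeds the limit.
    Returns the list of part file paths in order.
    """
    part_paths = []
    out = None
    written = 0
    try:
        with open(src_path, "rb") as src:
            for line in src:
                # Make sure the last line of the file is newline-terminated.
                if not line.endswith(b"\n"):
                    line += b"\n"
                # Start a new part when this line would overflow the current one.
                if out is None or written + len(line) > max_bytes:
                    if out is not None:
                        out.close()
                    path = f"{out_prefix}.part{len(part_paths) + 1}.jsonl"
                    out = open(path, "wb")
                    part_paths.append(path)
                    written = 0
                out.write(line)
                written += len(line)
    finally:
        if out is not None:
            out.close()
    return part_paths
```

For example, `split_jsonl("documents.jsonl", 9 * 1024**3, "documents")` would write `documents.part1.jsonl`, `documents.part2.jsonl`, and so on, each at most 9 GiB, leaving some headroom under the 10 GB limit. The file names here are placeholders. Each part can then be posted separately with curl.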
Issues with Importing Typesense Collection to Different Server
Kevin had problems migrating a Typesense collection between Docusaurus sites on different machines. Jason advised them on JSONL format, handling server hosting, and creating a collection schema before importing documents, leading to successful import.
Troubleshooting Write Timeouts in Typesense with Large CSVs
Agustin had issues with Typesense getting write timeouts while loading large CSV files. Kishore Nallan suggested chunking data or converting to JSONL before loading. Through troubleshooting, they identified a possible network problem at AWS and found a workaround.
Bulk Import 50MB JSON Files Error - Timeout and Solutions
madhweep encountered an error while bulk importing JSON files. Kishore Nallan provided help, but the issue persisted. Jason stepped in, and after troubleshooting they concluded that the cluster had run out of memory. The problem was resolved by using a cluster with sufficient memory. Daniel experienced a similar issue, which was resolved by increasing the timeout.
Troubleshooting Typesense Document Import Error
Christopher had trouble importing 2.1M documents into Typesense due to memory errors. Jason clarified the system requirements, explaining how the required RAM scales with dataset size, and suggested ways to tackle the issue. They also discussed database-like query options.
Discussion on Document Inserting Speed and Process
David inquired about document insertion speed, and Jason provided reference values and recommended sending more documents per API call. Both David and Chetan acknowledged the suggestions, and David said he would report back on his experience.