Issues with Large Index Export via Typesense Server
TLDR: Mojan had trouble exporting a large index from a Typesense server. Kishore Nallan said a fix was coming in an upcoming server version. The issue turned out to stem from the Python client, not the server.
Jul 06, 2023 (3 months ago)
Mojan
03:33 AM I'm trying to get an export of a fairly big index and it overwhelms the Typesense server. Wondering if export can be done in batches like import. Tried feeding it batch_size but it seems to not read it. Any ideas?
Kishore Nallan
03:34 AM
Mojan
03:38 AM
Kishore Nallan
03:39 AM
Kishore Nallan
03:39 AM
Mojan
03:39 AM
Mojan
07:00 PM I am trying the RC44 build and I can confirm it still falls over when exporting documents.
Which branch is it on GH? I can take a look to see the integration.
Kishore Nallan
10:46 PM
Kishore Nallan
10:48 PM
Jul 07, 2023 (3 months ago)
Mojan
02:56 AM P.S. I'm using Docker.
Kishore Nallan
02:58 AM
Kishore Nallan
03:02 AM
Mojan
06:03 PM
Mojan
06:04 PM indexed_pages = client.collections['pages'].documents.export({'batch_size': 10})
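For reference, the Typesense export endpoint returns documents as JSON Lines, one JSON document per line. The sketch below shows one way to walk such a payload a document at a time. It assumes the client call above returns the export as a single string; `iter_exported_docs` is a hypothetical helper, not part of the Typesense client, and the sample payload is made up for illustration.

```python
import json
from typing import Iterator


def iter_exported_docs(jsonl_text: str) -> Iterator[dict]:
    """Yield one parsed document per non-empty line of a JSONL export."""
    # Split on "\n" only, so unusual Unicode line separators inside
    # string values do not break a document in two.
    for line in jsonl_text.split("\n"):
        line = line.strip()
        if line:  # skip blank lines, e.g. after a trailing newline
            yield json.loads(line)


# Stand-in payload (assumption); in the thread this would be `indexed_pages`.
sample = '{"id": "1", "title": "Home"}\n{"id": "2", "title": "About"}\n'
docs = list(iter_exported_docs(sample))
```

Note that this only avoids building a second full copy of the data; the export string itself is still held in memory, so it does not help if the export call is what falls over.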
Mojan
06:10 PM
Jul 08, 2023 (3 months ago)
Kishore Nallan
12:47 AM
Jul 10, 2023 (2 months ago)
Mojan
07:36 PM The curl for export seems to be working fine.
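Since the plain HTTP export worked here, one option is to mirror the curl call from Python and stream the response straight to disk. What follows is a minimal sketch using only the standard library. The host, port, API key, collection name, and output path are placeholder assumptions, and both helper names are hypothetical.

```python
import shutil
import urllib.request
from urllib.parse import quote


def build_export_request(base_url: str, api_key: str, collection: str) -> urllib.request.Request:
    """Build a GET request for the documents export endpoint."""
    url = f"{base_url.rstrip('/')}/collections/{quote(collection, safe='')}/documents/export"
    return urllib.request.Request(url, headers={"X-TYPESENSE-API-KEY": api_key})


def export_to_file(base_url: str, api_key: str, collection: str,
                   dest_path: str, chunk_size: int = 1024 * 1024) -> None:
    """Stream the JSONL export to dest_path in chunks, not all at once."""
    req = build_export_request(base_url, api_key, collection)
    with urllib.request.urlopen(req) as resp, open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out, chunk_size)


# Example call with placeholder values (needs a reachable server):
# export_to_file("http://localhost:8108", "xyz", "pages", "pages.jsonl")
```

Writing to a file in fixed-size chunks keeps client memory use roughly constant regardless of index size, which is the same behaviour curl gives when its output is redirected to a file.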
Jul 11, 2023 (2 months ago)
Kishore Nallan
12:58 AM