#community-help

Issues with Large Index Export via Typesense Server

TLDR Mojan struggled to export a large index from a Typesense server. Kishore Nallan pointed to a fix for export memory usage in the 0.25 RC builds. The export worked via curl, so the remaining issue appears to stem from the Python client, not the server.

Jul 06, 2023 (3 months ago)
Mojan
03:33 AM
Hey guys,
I'm trying to export a fairly big index and it overwhelms the Typesense server. I'm wondering if export can be done in batches like import. I tried passing batch_size, but it doesn't seem to be read. Any ideas?
Kishore Nallan
03:34 AM
👋 This is fixed in the most recent 0.25 RC build (0.25.0.rc44).
Mojan
03:38 AM
I'm using the Typesense Python library. Is 0.25 released yet?
Kishore Nallan
03:39 AM
I'm talking about the server version. We are in the last stretch before the release of 0.25.

Kishore Nallan
03:39 AM
You can use the RC build; it's stable.
Mojan
03:39 AM
Thank you, Kishore.

Mojan
07:00 PM
Hey Kishore,

I am trying the RC44 build and I can confirm it still falls over when exporting documents.
Which branch is it on GitHub? I can take a look at the integration.
Kishore Nallan
10:46 PM
I identified one more issue related to this and fixed it again. Can you try on RC45? I've tested this on a large 12GB dataset
Kishore Nallan
10:48 PM
And what do you mean by "fall over"? What exactly happens? The issue I fixed limits the memory used during export. Previously it was using a huge buffer, so it ate up a lot of memory.
Jul 07, 2023 (3 months ago)
Mojan
02:56 AM
Thanks for the update. I tried RC45 but it still doesn't work. What happens is that the connection gets closed. I'm attaching screenshots of the error below:
P.S. I'm using Docker.
[Image 1 and Image 2: screenshots of the error]
Kishore Nallan
02:58 AM
What's your client side timeout?
Kishore Nallan
03:02 AM
After how long does this happen? Can you also post the code snippet for export? I will try it out.
Mojan
06:03 PM
My client-side timeout is 120000.
Mojan
06:04 PM
The error happens after about 10 minutes.
 indexed_pages = client.collections['pages'].documents.export({'batch_size': 10})
Mojan
06:10 PM
I am running this on a table of 16.41 GB, btw.
Jul 08, 2023 (3 months ago)
Kishore Nallan
12:47 AM
I tested exporting an 11 GB dataset on this build the other day. I ran it using curl, which has no timeouts. Can you try once via curl and without the batch size option?
Jul 10, 2023 (2 months ago)
Mojan
07:36 PM
Hi Kishore,

The export via curl seems to be working fine.
Jul 11, 2023 (2 months ago)
Kishore Nallan
12:58 AM
Then I think the issue is with the Python client.
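
Since the curl export worked, one possible workaround is to call the export endpoint directly from Python and read the response line by line, much as curl does, instead of going through the client library. The sketch below is only an illustration under stated assumptions: it assumes the server exposes the export at GET /collections/<name>/documents/export, authenticates with the X-TYPESENSE-API-KEY header, and returns one JSON document per line. The helper names (build_export_request, iter_export), the base URL, and the API key are hypothetical placeholders, not part of the Typesense Python library.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional


def build_export_request(base_url: str, collection: str, api_key: str,
                         params: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request for the (assumed) documents export endpoint."""
    name = urllib.parse.quote(collection, safe="")
    path = f"/collections/{name}/documents/export"
    query = f"?{urllib.parse.urlencode(params)}" if params else ""
    return urllib.request.Request(
        base_url.rstrip("/") + path + query,
        headers={"X-TYPESENSE-API-KEY": api_key},
    )


def iter_export(base_url: str, collection: str, api_key: str,
                params: Optional[dict] = None,
                timeout: Optional[float] = None,
                opener: Callable = urllib.request.urlopen) -> Iterator[dict]:
    """Yield exported documents one at a time.

    The response body is read line by line, so the whole export is never
    held in memory at once. timeout=None means no socket timeout, which
    mirrors the curl run described in the thread.
    """
    request = build_export_request(base_url, collection, api_key, params)
    with opener(request, timeout=timeout) as response:
        for raw_line in response:
            line = raw_line.strip()
            if line:  # skip blank lines between documents
                yield json.loads(line)


# Hypothetical usage (host, port, and key are placeholders):
# for doc in iter_export("http://localhost:8108", "pages", "xyz"):
#     handle(doc)
```

Because iter_export is a generator, each document can be written to disk or processed as it arrives, so a long-running export does not depend on buffering the full response on the client side.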