#community-help

Troubleshooting Write Timeouts in Typesense with Large CSVs

TLDR Agustin had issues with Typesense getting write timeouts while loading large CSV files. Kishore Nallan suggested chunking data or converting to JSONL before loading. Through troubleshooting, they identified a possible network problem at AWS and found a workaround.

Jun 11, 2021
Agustin
01:51 AM
Hi everyone! I'm using Typesense to index a large CSV of public records in an effort to enable government transparency.

For this purpose, I created a Python script that loads a huge CSV file, cleans it, splits it into chunks and imports records in parallel through the Typesense API. While the first few thousand records load correctly, I start getting write timeouts from the Typesense library seconds later:
ConnectionError: ('Connection aborted.', timeout('The write operation timed out'))

I tried retrying failing requests, but I can't even seem to catch the exceptions in the import_ function. My instance should have more than enough memory and CPU to handle everything (10 cores / 20 GB RAM with an 8 GB dataset).

Any ideas?
Kishore Nallan
02:23 AM
1. What does the Typesense log say?
2. Are you using the import API? If so, you don't have to parallelize the writes: the API itself has a batching parameter that allows you to send large amounts of data in one call.
Agustin
03:07 AM
1. (Edit: I'm uploading full logs)
2. Because the dataset is bigger than local memory (> 8 GB), I can't load it all at once, so I use Dask DataFrames to load it out-of-core and operate on chunks.
Kishore Nallan
03:18 AM
Can you generate a single JSONL file for your dataset (convert your CSV into a single JSONL file) and then try importing that using a single API call?
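A minimal sketch of the conversion step suggested above, assuming a CSV with a header row. The function name and file paths are hypothetical. Rows are read and written one at a time, so the file never has to fit in memory.

```python
import csv
import json


def csv_to_jsonl(csv_path, jsonl_path):
    """Stream a CSV file into a JSONL file, one row at a time.

    Returns the number of rows written. Every value stays a string,
    exactly as csv.DictReader produces it.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            # One JSON object per line is the JSONL format.
            dst.write(json.dumps(row, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Because every value comes out as a string, numeric or boolean fields would likely need casting inside the loop to match the collection schema.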
Agustin
03:19 AM
The logs don't show any error.
Kishore Nallan
03:20 AM
Yeah the logs look fine. What's the free memory now on the machine?
Agustin
03:21 AM
And yes, I can probably add an intermediate step: convert to a giant JSONL file first and then try the import.
Kishore Nallan
03:21 AM
I do see a response_abort called in the logs though.
Kishore Nallan
03:21 AM
That is logged when the import is aborted in the middle when the client disconnects.
Kishore Nallan
03:22 AM
I think you will have great success with a single JSONL file. Our import API is completely streaming and we regularly import several gigabytes of JSONL files into Typesense without any issues.

Agustin
03:23 AM
I won't need to load the JSONL file to memory, right?
Kishore Nallan
03:23 AM
The write timeouts could be an indication that the server is getting more writes than it can finish indexing within the default timeout interval. With a streaming import, we automatically handle this scenario.
Kishore Nallan
03:24 AM
> I won't need to load the JSONL file to memory, right?
That will depend on the HTTP client used. It should be smart enough. Otherwise, just use curl.
Agustin
03:26 AM
Seeing the API reference, it seems the Typesense library for Python would use batches while uploading JSONL, is that right?
Kishore Nallan
03:28 AM
The Python client does not accept a file name so it must be either a list of document objects or a stringified jsonl representation of the documents, neither of which will be streaming in nature.
Kishore Nallan
03:28 AM
So I suggest using CURL.
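For anyone who would rather stay in Python than shell out to curl, here is a hedged sketch of a streaming upload using only the standard library. It is not part of the Typesense Python client. The function name, host, port, and timeout are assumptions, and the endpoint path mirrors the import URL used with curl later in this thread. `http.client` reads a file object in blocks when it is passed as the request body, so the JSONL file is not loaded into memory.

```python
import http.client
import os


def stream_jsonl_import(host, port, collection, api_key, jsonl_path, timeout=600):
    """POST a JSONL file to the import endpoint without reading it into memory.

    Returns (status_code, response_text).
    """
    path = f"/collections/{collection}/documents/import?action=create"
    headers = {
        "X-TYPESENSE-API-KEY": api_key,
        # An explicit length avoids chunked transfer encoding.
        "Content-Length": str(os.path.getsize(jsonl_path)),
    }
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        with open(jsonl_path, "rb") as body:
            # A file object body is sent block by block.
            conn.request("POST", path, body=body, headers=headers)
            response = conn.getresponse()
            return response.status, response.read().decode("utf-8")
    finally:
        conn.close()
```

The response body is read in full here, which is usually small compared with the upload: the import endpoint returns one short result line per document.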
Agustin
03:29 AM
I understand. Thank you!
Kishore Nallan
03:29 AM
I have taken note to improve this aspect of the client.
Agustin
03:30 AM
My experience with Typesense has been awesome so far. We're migrating from AppSearch and we're just incredibly impressed by its performance.
Agustin
03:30 AM
Thank you for making the project open source!
Kishore Nallan
03:30 AM
Glad to hear, thank you for the feedback. Definitely want to keep making it better.
Agustin
03:31 AM
We would use Typesense Cloud, but we're an NGO with a very limited budget, so we're taking advantage of AWS credits at the moment. We will definitely consider migrating in the future.
Kishore Nallan
03:31 AM
Understood 👍
Ricardo
08:11 AM
Just as an aside, I ran into something similar yesterday and also had to chunk my upsert, which had 1000 documents, into chunks of about 200 each for it to work.
Ricardo
08:12 AM
I am using a weak server, but my dataset is small.
Kishore Nallan
08:16 AM
Did you make a single import API call, or did you run imports in parallel via code?
Ricardo
09:20 AM
typesense_client.collections['collection'].documents.import_(documents, {'action': 'upsert'})
Ricardo
09:20 AM
that's what I do
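The chunking workaround Ricardo describes could look roughly like the sketch below. The helper names are hypothetical, and the chunk size of 200 is taken from his message. The import call is passed in as a function so the helper does not depend on any particular client.

```python
from itertools import islice


def chunked(documents, size):
    """Yield successive lists of at most `size` documents."""
    iterator = iter(documents)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def import_in_chunks(import_fn, documents, size=200):
    """Call `import_fn(batch)` once per chunk and collect the per-document results."""
    results = []
    for batch in chunked(documents, size):
        results.extend(import_fn(batch))
    return results


# Usage with the call shown above (assumes `typesense_client` and `documents` exist):
#   import_in_chunks(
#       lambda batch: typesense_client.collections['collection']
#           .documents.import_(batch, {'action': 'upsert'}),
#       documents,
#   )
```

Since `chunked` accepts any iterable, it also works with a generator that reads documents lazily from disk.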
Kishore Nallan
09:20 AM
And the client times out?
Ricardo
09:21 AM
yeah, and at times also gave me an error parsing the response from the server. tbh I chunked it and didn't look at it more, since that fixed it.
Kishore Nallan
09:22 AM
Client timeout can occur depending on the size and number of documents, in which case you can bump up the client timeout value. It should then complete gracefully. It's a client config change.
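As a rough illustration of that config change in the Python client: the timeout is set through the `connection_timeout_seconds` key of the client configuration. The host, port, API key, and the value of 60 seconds below are placeholders, not recommendations.

```python
# Hypothetical node details; replace with your own host, port, and key.
client_config = {
    "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
    "api_key": "xyz",
    # Seconds the client waits on a request before giving up.
    "connection_timeout_seconds": 60,
}

# With the typesense package installed, this dict is passed to the client:
#   import typesense
#   client = typesense.Client(client_config)
```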
Ricardo
09:23 AM
I did, but it seemed to time out before the timeout I had set, which was 10s.
Ricardo
09:23 AM
anyway I can revisit it
Ricardo
09:23 AM
gonna have to make changes to my importer again
Kishore Nallan
09:23 AM
If the request is terminated abruptly, that can also cause a parse error.
Kishore Nallan
09:24 AM
If you can reproduce it consistently on a dataset you can share then I will be happy to troubleshoot.
Ricardo
09:32 AM
yeah I can share this dataset, I will take a look at it next time I am working on it and provide you with the dataset and the code that imports it.

sonu
02:21 PM
I too had to increase my timeout to 10s for a 1000-doc import. Sometimes it would just drop a few files. Anyway, increasing the timeout fixed it.
Jun 13, 2021
Agustin
02:18 AM
I generated JSONL files and tried curl, but now I get "curl: (18) transfer closed with outstanding read data remaining" from curl every time I upload a file.
Agustin
02:18 AM
@Kishore Nallan
Kishore Nallan
02:21 AM
Can you please tell me:

a) How many lines does the generated JSONL file contain
b) What exact CURL command you are using?
c) How long does the CURL command run before getting this error?
d) After the curl command fails, how many records were imported successfully on Typesense?
Agustin
02:28 AM
a) 229061 (each jsonl file)
b)
curl -H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY" -X POST --data-binary @$FILE \
      "http://$ENDPOINT/collections/revenue_entry/documents/import?action=create"

c) 1m 15s
d) Around 70k
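To answer question (d) more precisely than an estimate, the import response can be tallied. The sketch below assumes the response body is JSONL with one result object per document carrying a `success` flag (and an `error` message on failure). The function name is hypothetical. A final line that fails to parse is counted separately, since an aborted transfer can cut the response off mid-line.

```python
import json


def summarize_import_response(response_text):
    """Count successful documents and collect (line_number, reason) for the rest."""
    succeeded = 0
    failures = []
    for line_number, line in enumerate(response_text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            result = json.loads(line)
        except json.JSONDecodeError:
            # Likely a line truncated by a dropped connection.
            failures.append((line_number, "unparseable response line"))
            continue
        if result.get("success"):
            succeeded += 1
        else:
            failures.append((line_number, result.get("error", "unknown error")))
    return succeeded, failures
```

With curl, the response could be saved to a file (for example with `-o`) and its contents passed to this function.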
Kishore Nallan
02:31 AM
Got it. Also, what is the size of the file with 229061 lines?
Agustin
02:31 AM
170MB
Kishore Nallan
02:32 AM
Hmm 🤔 That is perfectly fine and it should not time out this way... We regularly import files with millions of records for our demo. Are you importing these files from your local machine to the remote machine, or are you running the imports within the remote machine? Can you try doing the latter to see if it helps?
Kishore Nallan
02:33 AM
Basically, try copying a file onto the remote machine and then running the import command locally on that machine to see if it helps.
Agustin
02:34 AM
Local to remote, less than 80 ms latency to my container on AWS.
Kishore Nallan
02:35 AM
Okay, let's rule out any networking quirks between local and remote first.
Agustin
02:35 AM
There's an AWS Application Load Balancer in front of the container. I'll check its configuration.
Kishore Nallan
02:37 AM
If it repeatedly fails exactly at the 1min 15 secs interval, there might be some form of timeout configuration somewhere.
Agustin
02:42 AM
Can't find anything on the AWS ALB config.
Kishore Nallan
02:43 AM
I think if you just copied the file inside the instance and ran the import, we can rule this out easily. If the issue occurs again, then we know it is not network related.
Agustin
03:19 AM
@Kishore Nallan I just uploaded the file from an instance in the same internal network as the container and got the same result.
Agustin
03:22 AM
Wait, I think using a different subnet worked.
Kishore Nallan
03:25 AM
Is Typesense running inside Docker on a plain EC2 instance?
Agustin
03:35 AM
It's running on Docker with Fargate.
Agustin
03:36 AM
For now, my workaround would be to upload the files through SFTP to a secondary instance and run the import from there over the internal network. Still not sure what's causing the limit, but it's definitely a network component in AWS.
Kishore Nallan
03:57 AM
OK, glad to hear that it works with the workaround. The Application Load Balancer might have a default idle timeout that needs to be increased.
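If the idle timeout does turn out to be the cause (the thread never confirms this), it can be raised through the load balancer attribute `idle_timeout.timeout_seconds`. A hedged sketch using a boto3 `elbv2` client follows. The function name, the ARN placeholder, and the value of 300 seconds are assumptions for illustration.

```python
def set_alb_idle_timeout(elbv2_client, load_balancer_arn, seconds):
    """Raise the idle timeout on an Application Load Balancer."""
    return elbv2_client.modify_load_balancer_attributes(
        LoadBalancerArn=load_balancer_arn,
        Attributes=[
            # Attribute values are passed as strings.
            {"Key": "idle_timeout.timeout_seconds", "Value": str(seconds)},
        ],
    )


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   set_alb_idle_timeout(boto3.client("elbv2"), "<your-load-balancer-arn>", 300)
```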