#community-help

Discussing Typesense Search Request Performance

TLDR: Al measured Typesense search requests taking longer than the reported search times, sparking a detailed examination of JSON parsing, response times, and data transfer. Jason and Kishore Nallan helped solve the issue.

May 17, 2021 (31 months ago)
Al
07:06 PM
Hi all (also Kishore!), I've set up Typesense and indexed a collection of about 20k objects. The schema is a single id and a text field, which is the one indexed.

Long thread incoming, but you would probably appreciate my findings.

I have noticed that when I bundle multiple search requests into a single one (i.e. multisearch), the time it takes to complete the request does not match the sum of the times reported by the individual search_time_ms values.
Al
07:07 PM
Check out this pic:
Al
07:07 PM
1st col is the text query, 2nd col is the value reported by search_time_ms for each one, 3rd col is the actual time as measured in my code.

And the final value is the time it took for everything to run.

If you take the 2nd column and sum all the values, you would get something like 200 ms, but for whatever reason, the whole thing takes about twice the time to execute.
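The comparison Al describes can be sketched as follows. This is a minimal illustration, not code from the thread: the function name, the report fields, and the sample numbers are all hypothetical, chosen only to mirror "about 200 ms reported, about twice that measured".

```python
from dataclasses import dataclass


@dataclass
class OverheadReport:
    reported_ms: float     # sum of the per-query search_time_ms values
    measured_ms: float     # wall-clock time observed by the client
    overhead_ms: float     # time not accounted for by server-side search
    overhead_ratio: float  # overhead as a fraction of the measured time


def summarize_overhead(search_times_ms, measured_total_ms):
    """Compare the summed search_time_ms values against the measured total."""
    reported = float(sum(search_times_ms))
    overhead = measured_total_ms - reported
    ratio = overhead / measured_total_ms if measured_total_ms > 0 else 0.0
    return OverheadReport(reported, measured_total_ms, overhead, ratio)


# Illustrative numbers only; these are not the values from Al's screenshot.
report = summarize_overhead([20, 35, 45, 100], measured_total_ms=400.0)
print(report)
```

Anything that ends up in overhead_ms is time spent outside the search itself, such as HTTP handling, data transfer, and client-side JSON parsing.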
Al
07:11 PM
I am running Typesense on localhost, so it's not a network round-trip issue, BUT I still think it is related to how long it takes to transport all the data from TS to my client. I am looking at the typical search response and it is quite verbose. Is there a way to disable some of the fields in the search result? Like the snippets, which I don't need.
Jason
09:53 PM
Al The difference between the reported and measured times (the 2nd and 3rd columns) seems to be around 3-6 ms. This additional time most likely comes from HTTP-layer parsing, a small TCP overhead, and the actual data transfer (as you mentioned). The search time excludes these.

You could disable some document fields from being returned using the exclude_fields search parameter, but there's no way to disable snippets at the moment.
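As a rough sketch of Jason's suggestion, the request body for a multi-search call could carry exclude_fields on each search. The helper name and the collection name "docs" are assumptions made for illustration; the "text" field is the one Al describes in his schema, and exclude_fields is passed as a comma-separated string.

```python
import json


def build_multi_search_body(queries, collection, query_by, exclude_fields):
    """Return a multi-search request body with one search entry per query."""
    return {
        "searches": [
            {
                "collection": collection,      # hypothetical collection name
                "q": q,
                "query_by": query_by,
                # Fields listed here are left out of the returned documents.
                "exclude_fields": ",".join(exclude_fields),
            }
            for q in queries
        ]
    }


body = build_multi_search_body(
    ["first query", "second query"],
    collection="docs",
    query_by="text",
    exclude_fields=["text"],
)
payload = json.dumps(body)
print(payload)
```

The resulting payload would be sent as the body of a POST to the multi_search endpoint. As Jason notes, this trims the document fields only; the snippets are still returned.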
Jason
09:54 PM
There's this GitHub issue we're tracking to disable highlights: https://github.com/typesense/typesense/issues/260

Al
10:56 PM
Following it, thanks!
Al
10:59 PM
Jason OK, FWIW, the same queries, with the same fields, data, and server, take about 150 ms to run on Meilisearch. Each query takes about the same time as in Typesense, but without the extra overhead. If you find a way to speed this up, it would be a nice performance improvement. I wish I could help more, as I am liking TS a lot so far.
Jason
11:02 PM
I see, I wonder if this is specific to multi-search, since there's some JSON parsing involved there on the server side. If it's not too much trouble, would you be able to repeat this test with the single search (documents/search) endpoint?
Al
11:03 PM
Yes, I've just finished that about an hour ago 😄

One path uses multisearch, the other is just Promise.all(<with a lot of single searches>) (so they run concurrently)

Results are pretty much the same on both paths: same time per query, same overhead, same time overall (±20 ms or so).
Jason
11:11 PM
Got it, would you be able to share this dataset (say via email)? I used a 2M recipes dataset recently to benchmark Typesense and Meilisearch and that dataset showed faster search response times with Typesense consistently. So I wonder if this is something specific to this dataset...
Al
11:26 PM
Sure, is JSON fine with you? I can dump the documents array, it's exactly the same on both ms and ts.
Jason
11:26 PM
Yup, JSON is perfect, thank you!
Jason
11:26 PM
Oh and also the collection schema you used
Al
11:27 PM
Sure, what's your email?
Jason
11:27 PM
Al
11:27 PM
Ok, great give me a few mins.

Al
11:43 PM
Email sent 👌
Jason
11:45 PM
Thank you! Will take a look
Al
11:46 PM
Thank you!
May 18, 2021 (31 months ago)
Kishore Nallan
05:21 AM
I will be taking a look at this today. Can you please tell me what version of Typesense you are using locally?
Kishore Nallan
06:06 AM
I just tried it on the dataset you shared (with 42K records), and I am not able to reproduce the latency. When I use curl with timing, the entire query, including the response, finishes in about 74 ms. This is on the Typesense v0.20.0 Docker image. I will email you the query snippet I used so that we can compare results.
Al
06:16 AM
Hi again Kishore, ok. So, you mean the whole multisearch 'query' right (all of them bundled)?
Kishore Nallan
06:16 AM
Yup check your email for the exact snippet.
Al
06:17 AM
I left out a few in my pic, but still you should be able to see a big difference between the sum of all search_time_ms and the measured time for the whole thing (or not?).
Kishore Nallan
06:22 AM
I am measuring the end-to-end time taken by the curl request. The gist has the curl request I'm sending. If you can run the same on your localhost, we can compare the times.
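For anyone wanting to take the same kind of end-to-end measurement from code rather than curl, a minimal timing wrapper might look like the sketch below. The names time_call and fake_request are hypothetical, and fake_request is only a stand-in for the real HTTP call, since the exact request from the gist is not shown in this thread.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed milliseconds) on a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_request():
    # Stand-in for the real request (e.g. a POST to the multi_search endpoint).
    time.sleep(0.01)
    return {"results": []}


result, elapsed_ms = time_call(fake_request)
print(f"end-to-end: {elapsed_ms:.1f} ms")
```

Wrapping the whole call, including reading and parsing the response, is what makes the number comparable to a client-side measurement instead of search_time_ms.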
Al
06:23 AM
Ok, let me check.
Al
06:48 AM
Hi Kishore, I got your email but I'm really tired now (2am). I'll go sleep a bit and come back tomorrow with the results you asked for. Have a good day/night!
Kishore Nallan
06:50 AM
No problem, good night!
May 19, 2021 (31 months ago)
Al
02:08 AM
Hi Kishore, I sent you an email some time ago, but I just wanted to add something. I've also written some code in Node.js that makes an HTTP request to the multi_search endpoint and parses the result. Measured times are virtually identical to using the library for the same thing. So I guess the overhead I'm seeing is just data transfer + JSON parsing, and not much can be done about that. The response from the server for all my queries combined is about 2 MB, so that explains it. I'll wait until there's a way to disable fields from the response; that's the bottleneck in my case.
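One way to check how much of that overhead is JSON parsing alone is to time the parse step on a payload of similar size. The sketch below uses a synthetic payload of roughly 2 MB; its shape is only loosely modeled on a search response and is an assumption, as are the helper names. Al's client was Node.js, so absolute numbers from Python will differ.

```python
import json
import time


def make_payload(target_bytes):
    """Build a synthetic, search-like JSON string of at least target_bytes."""
    hit = {"document": {"id": "0", "text": "lorem ipsum " * 40}, "highlights": []}
    hit_size = len(json.dumps(hit).encode("utf-8"))
    hits = [hit] * (target_bytes // hit_size + 1)
    return json.dumps({"hits": hits})


def time_parse(raw):
    """Parse raw JSON and return (parsed object, elapsed milliseconds)."""
    start = time.perf_counter()
    parsed = json.loads(raw)
    return parsed, (time.perf_counter() - start) * 1000.0


raw = make_payload(2 * 1024 * 1024)
parsed, parse_ms = time_parse(raw)
print(f"{len(raw.encode('utf-8')) / 1e6:.2f} MB parsed in {parse_ms:.1f} ms")
```

Subtracting this parse time from the total overhead gives a rough idea of how much is left for transfer and HTTP handling.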
Kishore Nallan
02:16 AM
Got it. Have you tried using exclude_fields?
Al
02:28 AM
Let me see what happens if I remove the 'text' field which is the heaviest.
Al
03:31 PM
(Forgot to come back to that.) It doesn't make a big difference, apparently.
Kishore Nallan
03:32 PM
Has the size of response dropped when you excluded the field?
Kishore Nallan
03:35 PM
But it certainly seems like the issue is with the client: either JSON parsing or something else.
May 20, 2021 (30 months ago)
Al
02:42 PM
Yes, it dropped, but processing time didn't change significantly.
Kishore Nallan
02:51 PM
Got it. We will now be looking at it from a JSON parsing angle. Will keep you posted.