# community-help
a
Hi all (also Kishore!), I've set up Typesense and already indexed a collection with about 20k objects; the schema is a single id plus a text field, which is the one indexed. Long thread incoming, but you'll probably appreciate my findings. I have noticed that, when running multiple search requests in a single one (i.e. multi_search), the time it takes to complete the requests does not match the time reported by the individual search_time_ms values.
Check out this pic:
1st col is the text query, 2nd col is the value reported by search_time_ms for each one, 3rd col is the actual time as measured in my code, and the final value is the time it took for everything to run. If you sum all the values in the 2nd column, you get something like 200 ms, but for whatever reason, the whole thing takes about twice that time to execute.
I am running Typesense on localhost, so it's not a network round-trip issue, BUT I still think it is related to how long it takes to transport all the data from TS to my client. Looking at a typical search response, it is quite verbose. Is there a way to disable some of the fields in the search result? Like the snippets, which I don't need.
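For reference, the comparison being described (sum of the reported per-query `search_time_ms` values versus the wall-clock time measured around the whole batch) can be sketched like this; the numbers below are illustrative, not the actual measurements from the screenshot:

```javascript
// Sum the server-side search times reported by Typesense. These exclude
// HTTP parsing, TCP overhead and data transfer.
function reportedTotalMs(results) {
  return results.reduce((sum, r) => sum + r.search_time_ms, 0);
}

// Whatever the server-side sum doesn't account for is client/transport
// overhead: wall-clock time minus the reported total.
function overheadMs(results, wallClockMs) {
  return wallClockMs - reportedTotalMs(results);
}

// Illustrative numbers only:
const results = [
  { search_time_ms: 12 },
  { search_time_ms: 8 },
  { search_time_ms: 15 },
];
console.log(reportedTotalMs(results)); // 35
console.log(overheadMs(results, 70)); // 35
```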
j
@Al Mo The difference between col 2 and col 3 seems to be around 3-6 ms per query. This additional time most likely comes from HTTP-layer parsing, a tiny bit of TCP overhead and the actual data transfer (like you mentioned); `search_time_ms` excludes these. You could stop some document fields from being returned using the `exclude_fields` search parameter, but there's no way to disable snippets at the moment.
There's this Github issue we're tracking to disable highlights: https://github.com/typesense/typesense/issues/260
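A minimal sketch of a multi_search body using `exclude_fields` to trim the response payload. The field names follow the schema mentioned in this thread (a single `id` plus an indexed `text` field); the collection name `documents` is an assumption, so adjust both for your own setup:

```javascript
// Build a multi_search request body where every search excludes the heavy
// `text` field from the returned documents. Note that snippets/highlights
// are still included; only stored document fields can be excluded this way.
function buildMultiSearchBody(queries) {
  return {
    searches: queries.map((q) => ({
      collection: "documents", // assumed collection name
      q,
      query_by: "text",
      exclude_fields: "text",
    })),
  };
}

const body = buildMultiSearchBody(["apple", "banana"]);
console.log(JSON.stringify(body, null, 2));
```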
👍 1
a
Following it, thanks!
@Jason Bosco Ok, FWIW: the same queries, same fields, same data, same server take about 150 ms total to run on Meilisearch. Each query takes about the same as in Typesense, but without the extra overhead. If you find a way to speed this up, it would be a nice performance improvement. I wish I could help more; I am liking TS a lot so far.
j
I see, I wonder if this is specific to multi-search, since there's some JSON parsing involved there on the server side. If it's not too much trouble, would you be able to repeat this test with the single search (documents/search) endpoint?
a
Yes, I just finished that about an hour ago 😄 One path uses multi_search, the other is just Promise.all(<with a lot of single searches>), so they run concurrently. Results are pretty much the same on both paths: same time per query, same overhead, same overall time (±20 ms or so).
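The Promise.all path above can be sketched roughly as follows. The `search` function here is a dummy stand-in for a real single-search client call (e.g. something wrapping the documents/search endpoint), used only so the sketch runs on its own:

```javascript
// Time an async batch end to end, the way the thread measures wall-clock
// time around the whole set of requests.
async function timeBatch(label, fn) {
  const start = Date.now();
  const result = await fn();
  console.log(`${label}: ${Date.now() - start} ms total`);
  return result;
}

// Dummy stand-in for one single-search request; a real version would hit
// the documents/search endpoint for one query.
const search = (q) =>
  new Promise((resolve) => setTimeout(() => resolve({ q, hits: [] }), 10));

(async () => {
  // Many single searches fired concurrently, mirroring the Promise.all path:
  const results = await timeBatch("Promise.all path", () =>
    Promise.all(["apple", "banana", "cherry"].map(search))
  );
  console.log(results.map((r) => r.q)); // results come back in query order
})();
```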
j
Got it, would you be able to share this dataset (say via email)? I used a 2M recipes dataset recently to benchmark Typesense and Meilisearch and that dataset showed faster search response times with Typesense consistently. So I wonder if this is something specific to this dataset...
a
Sure, is JSON fine with you? I can dump the documents array, it's exactly the same on both ms and ts.
j
Yup, JSON is perfect, thank you!
Oh and also the collection schema you used
a
Sure, what's your email?
j
a
Ok, great give me a few mins.
🙏 1
Email sent 👌
j
Thank you! Will take a look
a
Thank you!
k
I will be taking a look at this today. Can you please tell me what version of Typesense you are using locally?
I just tried it on the dataset you shared (with 42K records), and I am not able to reproduce the latency. When I use curl + timing, the entire query including the response finishes in about 74 ms. This is on the Typesense v0.20.0 Docker image. I will email you the query snippet I used so that we can compare results.
a
Hi again Kishore, ok. So you mean the whole multi_search 'query', right (all of them bundled)?
k
Yup, check your email for the exact snippet.
a
I left out a few queries in my pic, but you should still be able to see a big difference between the sum of all the search_time_ms values and the measured time for the whole thing (or not?).
k
I am measuring the end-to-end time taken by the curl request. The gist has the curl request I'm sending; if you can run the same on your localhost, we can compare the times.
a
Ok, let me check.
Hi Kishore, I got your email, but I'm really tired now (2 am). I'll sleep a bit and come back tomorrow with the results you asked for. Have a good day/night!
k
No problem, good night!
a
Hi Kishore, I sent you an email some time ago, but I just wanted to add something. I've also written some code in Node.js that makes an HTTP request to the multi_search endpoint and parses the result. Measured times are virtually identical to using the library for the same thing. So I guess the overhead I'm seeing is just data transfer + JSON parsing; not much that can be done about that. The response from the server for all my queries combined is about 2 MB, so that explains it. I'll wait until there's a way to disable fields in the response; that's the bottleneck in my case.
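To isolate the JSON-parsing share of that overhead, one rough approach is to time `JSON.parse` on a payload of about the same size as the real response (~2 MB here). The synthetic payload below only resembles search hits in shape; it is not the actual dataset:

```javascript
// Time a single JSON.parse call in milliseconds using Node's
// high-resolution clock.
function measureParseMs(jsonString) {
  const start = process.hrtime.bigint();
  JSON.parse(jsonString);
  const end = process.hrtime.bigint();
  return Number(end - start) / 1e6; // ns -> ms
}

// Build a roughly 2 MB synthetic payload shaped like search hits:
const hits = Array.from({ length: 20000 }, (_, i) => ({
  id: String(i),
  text: "x".repeat(80),
}));
const payload = JSON.stringify({ results: [{ hits }] });
console.log(`payload size: ~${(payload.length / 1e6).toFixed(1)} MB`);
console.log(`parse time: ${measureParseMs(payload).toFixed(1)} ms`);
```

If the parse time is a small fraction of the observed overhead, the remainder is mostly transfer and HTTP handling.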
k
Got it. Have you tried using `exclude_fields`?
a
Let me see what happens if I remove the 'text' field, which is the heaviest.
(Forgot to come back to this.) It doesn't make a big difference, apparently.
k
Has the size of response dropped when you excluded the field?
But it certainly seems like the issue is with the client: either JSON parsing or something else.
a
Yes, it dropped, but processing time didn't change significantly.
k
Got it. We will now look at it from a JSON-parsing angle. Will keep you posted.