# community-help
l
Hey everyone. We've been doing a lot of work recently to try to improve the speed of our requests, as they are unfortunately a little on the slow side, and we've become a bit stuck with numerical filtering. We have a price filter which could be any value from 0 to 100_000_000, and in our UI we use a range slider to let the user select what prices they would like to see. We then form a filter like `filter_by: price:[100..500000]`, as an example. We have about 3.5 million records, and on average this search takes 1 second to complete. Our current ideas are to try all of the following:
• Adding `range_index` to the schema (see the sketch after this message)
  ◦ We've done this for part of our data set and so far we are only seeing a very small improvement
• Increase RAM
• Move to run in a VM
• Move to Typesense Cloud
If all of that doesn't help, then we have thought about turning it into a string-based filter with a set of ranges, so something priced at 10 would be assigned `priceFilter: '0'` (range 0 to 100) and something priced at 12000 would get `priceFilter: '10000'` (range 10_000 to 20_000). We would then turn this into a filter like so: `filter_by: (priceFilter:='0' || priceFilter:='100' || priceFilter:='1000' || priceFilter:='10000')`. We would like to avoid doing this, though, as it would be a huge undertaking to apply this new `priceFilter` to all 3.5 million records. If there is a better way, or something that we are missing, I would really appreciate some advice. Thanks in advance.
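(For reference, a rough sketch of what enabling `range_index` on the two price estimate fields could look like, using Typesense's alter-schema endpoint. The host, field types and facet flags below are assumptions and should be taken from the real schema; changing a field's properties means dropping and re-adding it, which re-indexes those fields across all 3.5 million documents.)

# Hypothetical sketch -- field types and flags are assumptions, match them to the existing schema
curl 'https://<typesense-host>:8108/collections/lots' \
  -X PATCH \
  -H 'x-typesense-api-key: ***' \
  --data-raw '{
   "fields":[
      {"name":"price.GBPLowEstimate","drop":true},
      {"name":"price.GBPLowEstimate","type":"float","facet":true,"range_index":true},
      {"name":"price.GBPHighEstimate","drop":true},
      {"name":"price.GBPHighEstimate","type":"float","facet":true,"range_index":true}
   ]
}'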
k
I'm surprised that adding `range_index` does not help. Could it be that there are other parameters in the search query which are actually slowing it down (or maybe `range_index` in combination with those becomes ineffective)?
Would you be able to share a sample query that's slow here?
l
We found about a 40ms improvement from `range_index`, although we're not completely finished with our testing of that yet.
As an example, this is a query without any price filter, and it took 97ms:
curl '***/search/multi_search?use_cache=true' \
  -H 'x-typesense-api-key: ***' \
  --data-raw '{
   "searches":[
      {
         "collection":"lots",
         "exclude_fields":"footnotes,catalogDesc",
         "filter_by":"(department.name:=Collector Cars) && (status:=NEW)",
         "facet_by":"brand,country.name,country.code,department.code,department.name,flags.isWithoutReserve,price.GBPHighEstimate,price.GBPLowEstimate,price.estimateHigh,price.estimateLow,status,groups",
         "query_by":"title",
         "sort_by":"price.GBPLowEstimate:desc",
         "page":1,
         "per_page":48,
         "max_facet_values":1000,
         "q":""
      },
      {
         "collection":"lots",
         "exclude_fields":"footnotes,catalogDesc",
         "filter_by":"(status:=NEW)",
         "facet_by":"department.name",
         "query_by":"title",
         "sort_by":"price.GBPLowEstimate:desc",
         "page":0,
         "per_page":0,
         "max_facet_values":1000,
         "q":""
      },
      {
         "collection":"lots",
         "exclude_fields":"footnotes,catalogDesc",
         "filter_by":"(department.name:=Collector Cars)",
         "facet_by":"status",
         "query_by":"title",
         "sort_by":"price.GBPLowEstimate:desc",
         "page":0,
         "per_page":0,
         "max_facet_values":1000,
         "q":""
      }
   ]
}'
And here is an example with a price filter, which took 1470ms:
curl '***/search/multi_search?use_cache=true' \
  -H 'x-typesense-api-key: ***' \
  --data-raw '{
   "searches":[
      {
         "collection":"lots",
         "exclude_fields":"footnotes,catalogDesc",
         "filter_by":"(price.GBPLowEstimate:[0..276797] || price.GBPHighEstimate:[0..276797]) && (department.name:=Collector Cars) && (status:=NEW)",
         "facet_by":"brand,country.name,country.code,department.code,department.name,flags.isWithoutReserve,price.GBPHighEstimate,price.GBPLowEstimate,price.estimateHigh,price.estimateLow,status,groups",
         "query_by":"title",
         "sort_by":"price.GBPLowEstimate:desc",
         "page":1,
         "per_page":48,
         "max_facet_values":1000,
         "q":""
      },
      {
         "collection":"lots",
         "exclude_fields":"footnotes,catalogDesc",
         "filter_by":"(price.GBPLowEstimate:[0..276797] || price.GBPHighEstimate:[0..276797]) && (status:=NEW)",
         "facet_by":"department.name",
         "query_by":"title",
         "sort_by":"price.GBPLowEstimate:desc",
         "page":0,
         "per_page":0,
         "max_facet_values":1000,
         "q":""
      },
      {
         "collection":"lots",
         "exclude_fields":"footnotes,catalogDesc",
         "filter_by":"(price.GBPLowEstimate:[0..276797] || price.GBPHighEstimate:[0..276797]) && (department.name:=Collector Cars)",
         "facet_by":"status",
         "query_by":"title",
         "sort_by":"price.GBPLowEstimate:desc",
         "page":0,
         "per_page":0,
         "max_facet_values":1000,
         "q":""
      }
   ]
}'
k
"max_facet_values":1000,
Is probably taking the most time here.
Also,
Copy code
(price.GBPLowEstimate:[0..276797] || price.GBPHighEstimate:[0..276797])
Is probably something that query optimizer is not equipped to handle. ORing of two range queries.
This can be a pretty intensive operation because prices can be so sparse.
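(One way to check whether ORing the two ranges is itself the expensive part: run the same first search with only one of the range clauses in `filter_by` and compare the timing against the 1470ms above. Everything other than `filter_by` below is copied from the earlier query.)

# Same as the first search above, but with a single range clause instead of the OR
curl '***/search/multi_search?use_cache=true' \
  -H 'x-typesense-api-key: ***' \
  --data-raw '{
   "searches":[
      {
         "collection":"lots",
         "exclude_fields":"footnotes,catalogDesc",
         "filter_by":"(price.GBPLowEstimate:[0..276797]) && (department.name:=Collector Cars) && (status:=NEW)",
         "facet_by":"brand,country.name,country.code,department.code,department.name,flags.isWithoutReserve,price.GBPHighEstimate,price.GBPLowEstimate,price.estimateHigh,price.estimateLow,status,groups",
         "query_by":"title",
         "sort_by":"price.GBPLowEstimate:desc",
         "page":1,
         "per_page":48,
         "max_facet_values":1000,
         "q":""
      }
   ]
}'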
l
Yeah, and unfortunately, as we are an auction house, we have a low estimate price and a high estimate price, which means we have to search on both to make sure we include the correct results for the customer.
Could you explain a little more about `max_facet_values`, please, and what we could do to improve there?
k
With `max_facet_values`, large values like 1000 can generally hurt if the facets are high-cardinality in nature. Try reducing it to, say, 10 (see the example after this message) and check what response time you get.
Also, these queries are running sequentially, i.e. every query in a multi_search is processed one after the other. Perhaps we can add an option to run them in parallel; however, you would need a lot of CPU cores if you already have a high concurrency of searches happening across users.
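(As a concrete example of that suggestion, here is the first search from above with the facet limit dropped from 1000 to 10; nothing else is changed.)

curl '***/search/multi_search?use_cache=true' \
  -H 'x-typesense-api-key: ***' \
  --data-raw '{
   "searches":[
      {
         "collection":"lots",
         "exclude_fields":"footnotes,catalogDesc",
         "filter_by":"(price.GBPLowEstimate:[0..276797] || price.GBPHighEstimate:[0..276797]) && (department.name:=Collector Cars) && (status:=NEW)",
         "facet_by":"brand,country.name,country.code,department.code,department.name,flags.isWithoutReserve,price.GBPHighEstimate,price.GBPLowEstimate,price.estimateHigh,price.estimateLow,status,groups",
         "query_by":"title",
         "sort_by":"price.GBPLowEstimate:desc",
         "page":1,
         "per_page":48,
         "max_facet_values":10,
         "q":""
      }
   ]
}'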
l
I just did a curl with it reduced to 10 and it's come down by approximately 200ms, but the request is still about 1200ms on average.
We have a very large user base; my team is looking to bump the RAM in each pod to 32GB.
I'm not sure about the CPU cores.
k
CPU cores are what you need, not RAM.
l
Is there a better way to structure the query at all? We have tried ordering things differently, but it doesn't seem to help.
k
I will run it by someone and see if we can organize this query more efficiently.
l
Thank you, we'd really appreciate that.