#community-help

Typesense Advanced Search Result Capping Issue and Potential Solutions.

TLDR David raised an issue with Typesense advanced search: results are capped at 250 items per page, so intersecting results ranked beyond 250 are excluded. David suggested either raising the limit or running the intersection server-side. Kishore Nallan agreed to consider both proposals.

Oct 17, 2023
David
03:35 PM
We've implemented Typesense and are building "advanced search" tooling. Our system returns the document IDs of the documents that match the search criteria.

To do advanced search (logic against multiple fields), we accept a number of fields to search across (e.g. name=David, company=Typesense), issue a multi-search for the non-null fields, and then find the intersection of the returned result sets.
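
A minimal sketch of this flow in Python, assuming the request and response shapes of Typesense's `multi_search` endpoint (a `searches` list in, a `results` list with `hits[].document.id` out). The collection name `people`, the helper names, and the sample response are hypothetical, and the HTTP call itself is left out.

```python
from typing import Optional


def build_searches(criteria: dict[str, Optional[str]], collection: str = "people",
                   per_page: int = 250) -> list[dict]:
    """Build one search per non-null field (collection name is hypothetical)."""
    return [
        {"collection": collection, "q": value, "query_by": field, "per_page": per_page}
        for field, value in criteria.items()
        if value is not None
    ]


def intersect_ids(multi_search_response: dict) -> set[str]:
    """Intersect the document IDs returned by each search in the response."""
    id_sets = [
        {hit["document"]["id"] for hit in result.get("hits", [])}
        for result in multi_search_response.get("results", [])
    ]
    return set.intersection(*id_sets) if id_sets else set()


# The null "title" field is skipped, so only two searches are issued.
searches = build_searches({"name": "David", "company": "Typesense", "title": None})

# Hypothetical response, trimmed to the fields this sketch reads.
sample_response = {
    "results": [
        {"found": 2, "hits": [{"document": {"id": "1"}}, {"document": {"id": "2"}}]},
        {"found": 2, "hits": [{"document": {"id": "2"}}, {"document": {"id": "3"}}]},
    ]
}
matching_ids = intersect_ids(sample_response)
```

Only document `2` appears in both result sets here, so it is the only ID that survives the intersection.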

However, the result sets are capped at 250 items per page, so it's possible that one set has more than 250 results and a document at position 251 or higher in it also appears within the first 250 results of another set. For example, if a document matches strongly on name and is #1 in the name search, but is a weaker match on company and is result 251 there, we would exclude it unless we page through all the searches. Our collections are relatively large (60k+), and we're worried that we would, in effect, have to scan hundreds of pages.
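
To make the paging cost concrete, here is a rough sketch of collecting every matching ID for one search by walking its pages. `fetch_page` is a hypothetical callable standing in for the real search request; it is assumed to return a dict with `found` and `hits`, as Typesense search responses do. The fake backend exists only to exercise the sketch.

```python
import math
from typing import Callable


def pages_needed(found: int, per_page: int = 250) -> int:
    """Number of requests needed to read `found` matches at `per_page` each."""
    return math.ceil(found / per_page)


def collect_all_ids(fetch_page: Callable[[int, int], dict], per_page: int = 250) -> set[str]:
    """Walk every page of one search and return the full set of matching IDs."""
    first = fetch_page(1, per_page)
    ids = {hit["document"]["id"] for hit in first["hits"]}
    for page in range(2, pages_needed(first["found"], per_page) + 1):
        ids.update(hit["document"]["id"] for hit in fetch_page(page, per_page)["hits"])
    return ids


# Fake backend with 600 matches (hypothetical data).
_all_ids = [str(i) for i in range(600)]


def fake_fetch(page: int, per_page: int) -> dict:
    start = (page - 1) * per_page
    chunk = _all_ids[start:start + per_page]
    return {"found": len(_all_ids), "hits": [{"document": {"id": i}} for i in chunk]}


all_ids = collect_all_ids(fake_fetch)
```

In the worst case, a single search that matches all 60,000 documents would take `pages_needed(60000)` = 240 requests at 250 per page, and that is per field searched.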

Is there any recommendation for handling this? Right now we just show a warning that "Too many results returned, try narrowing your search" but this is far from ideal.
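
One way to decide when to show that warning, sketched under the assumption that each result carries a `found` count alongside its `hits` (as Typesense search responses do): the intersection is guaranteed complete only when no individual search was cut off by the page cap. The function name and sample responses are hypothetical.

```python
def is_truncated(multi_search_response: dict) -> bool:
    """True if any search matched more documents than it returned on this page."""
    return any(
        result.get("found", 0) > len(result.get("hits", []))
        for result in multi_search_response.get("results", [])
    )


# Hypothetical responses: one fully returned, one cut off at 250 of 251 matches.
complete = {"results": [{"found": 2, "hits": [{"document": {"id": "1"}},
                                              {"document": {"id": "2"}}]}]}
capped = {"results": [{"found": 251,
                       "hits": [{"document": {"id": str(i)}} for i in range(250)]}]}

show_warning = is_truncated(capped)
```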
David
03:52 PM
We're considering building from source, increasing the per_page max, and enforcing reasonable search queries on the application side. Is there any danger in this? I don't see exactly why 250 was chosen, other than that it's neither too many nor too few.
Kishore Nallan
04:21 PM
There's a cost to be paid for sorting, so a larger per_page value is going to take longer. We've debated increasing this number, but it has a lot of potential for abuse. Perhaps we could bring in a command-line flag that people can enable to increase this limit.
David
04:25 PM
That would be very helpful. In cases like ours, an advanced search operation is understood to take longer in order to provide a more "accurate" result, so I think the tradeoff is acceptable.

The ideal solution would be to run the intersection server-side before sorting, since we are only interested in the intersection of the multi-search and don't need to spend any computing time sorting results that will be filtered out client-side.
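
The idea can be illustrated outside of Typesense with plain Python: intersect the per-field match sets first, then rank only the survivors. This is a conceptual sketch, not how Typesense works internally; the score dictionaries and the sum-of-scores ranking rule are made-up assumptions.

```python
def intersect_then_rank(scores_by_field: list[dict[str, float]], limit: int = 250) -> list[str]:
    """Intersect matches across fields, then sort only the surviving IDs."""
    if not scores_by_field:
        return []
    common = set(scores_by_field[0])
    for scores in scores_by_field[1:]:
        common.intersection_update(scores)  # keep IDs matched by every field
    # Rank survivors by combined score (hypothetical rule), breaking ties by ID.
    ranked = sorted(
        common,
        key=lambda doc_id: (-sum(s[doc_id] for s in scores_by_field), doc_id),
    )
    return ranked[:limit]


# Hypothetical per-field match scores keyed by document ID.
name_scores = {"a": 0.9, "b": 0.4, "c": 0.7}
company_scores = {"b": 0.8, "c": 0.1, "d": 0.6}
top = intersect_then_rank([name_scores, company_scores])
```

Documents `a` and `d` each match only one field, so they are dropped before any sorting happens; only `b` and `c` are ranked.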
David
04:25 PM
If I were to open a GitHub issue for that feature, would it be considered if there was enough interest?
Kishore Nallan
04:35 PM
Yes, definitely. Please create one. Meanwhile, we can look into lifting the limits via a parameter.