Understanding Search Result Variations with Filtering Parameters
TLDR SamHendley saw inconsistent document counts when adding more filter parameters. Jason explained that this is because Typesense limits the number of candidates it checks, for better performance. Increasing max_candidates or enabling exhaustive_search can help obtain all values.
Nov 17, 2022
SamHendley
09:05 PM The way this shows up is if I search for ‘bad’ I get 30 results (Limit is 50 so this would appear to be an ‘exhaustive listing’). These documents are spread across 5 types of documents (reported as facets). If I then filter to any of those document types I will sometimes get a much larger document count. The correct count in this case would have been 42. It’s hard to analyze the extra entries but it looks like it might be mostly cases of “harder to find” values in the middle of a string.
None of these parameters made a difference:
NumTypos: operutil.NewLit(0),
DropTokensThreshold: operutil.NewLit(0),
SplitJoinTokens: operutil.NewLit("off"),
TypoTokensThreshold: operutil.NewLit(0),
MaxCandidates: operutil.NewLit(0),
Eventually I figured out that it is related to ‘Prefix’. If I disable Prefix matching I get stable results. This isn’t a problem per se, but it’s not obvious why it is “giving up” so early and not finding all of the documents that can match the data. Any thoughts? If nothing else I’d recommend updating the documentation to indicate “prefix searching may return incomplete answers in X or Y cases”.
Jason
09:36 PM max_candidates unique candidates. So if you set that value to, say, 10K or set exhaustive_search: true, you should see all values.