Understanding Search Result Variations with Filtering Parameters
TLDR SamHendley saw inconsistent document counts when adding more filter parameters to a search. Jason explained that Typesense limits the number of candidates it checks in order to improve performance. Increasing max_candidates or enabling exhaustive_search can help obtain all values.
Nov 17, 2022 (13 months ago)
The way this shows up is if I search for ‘bad’ I get 30 results (Limit is 50 so this would appear to be an ‘exhaustive listing’). These documents are spread across 5 types of documents (reported as facets). If I then filter to any of those document types I will sometimes get a much larger document count. The correct count in this case would have been 42. It’s hard to analyze the extra entries but it looks like it might be mostly cases of “harder to find” values in the middle of a string.
None of these parameters made a difference:
NumTypos: operutil.NewLit(0),
DropTokensThreshold: operutil.NewLit(0),
SplitJoinTokens: operutil.NewLit("off"),
TypoTokensThreshold: operutil.NewLit(0),
MaxCandidates: operutil.NewLit(0),
Eventually I figured out that it is related to ‘Prefix’. If I disable Prefix matching I get stable results. This isn’t a problem per se, but it’s not obvious why it is “giving up” so early and not finding all of the documents that match. Any thoughts? If nothing else I’d recommend updating the documentation to indicate “prefix searching may return incomplete answers in X or Y cases”.
For performance, Typesense only checks up to max_candidates unique candidates for a prefix. So if you set that value to, say, 10K, or set exhaustive_search: true, you should see all values.
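To make the two suggestions concrete, here is a minimal sketch in Go that builds the query string for a search request with either option set. It uses only the standard library rather than a Typesense client, and the collection name `docs`, the fields `title` and `doc_type`, and the helper `buildSearchQuery` are hypothetical names chosen for illustration. The parameter names (`q`, `query_by`, `filter_by`, `prefix`, `max_candidates`, `exhaustive_search`) are the ones discussed in this thread.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// buildSearchQuery returns an encoded query string for a search request.
// A maxCandidates value <= 0 leaves max_candidates unset, so the server
// default applies. This helper is illustrative, not part of any client.
func buildSearchQuery(q, queryBy, filterBy string, maxCandidates int, exhaustive bool) string {
	v := url.Values{}
	v.Set("q", q)
	v.Set("query_by", queryBy)
	v.Set("prefix", "true") // keep prefix matching on, as in the thread
	if filterBy != "" {
		v.Set("filter_by", filterBy)
	}
	if maxCandidates > 0 {
		v.Set("max_candidates", strconv.Itoa(maxCandidates))
	}
	if exhaustive {
		v.Set("exhaustive_search", "true")
	}
	return v.Encode() // keys are emitted in sorted order
}

func main() {
	// Hypothetical collection and field names.
	path := "/collections/docs/documents/search?"

	// Option 1: raise the candidate limit for the unfiltered query.
	fmt.Println(path + buildSearchQuery("bad", "title", "", 10000, false))

	// Option 2: ask for an exhaustive search on a filtered query.
	fmt.Println(path + buildSearchQuery("bad", "title", "doc_type:=report", 0, true))
}
```

Comparing the `found` count from the unfiltered query against the filtered ones, with either option enabled, is one way to check that the counts now agree. Both options trade some query speed for completeness.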
Issue with 'max_candidates' in Prefix Search
Edward reported inconsistent search results when using prefix 'EXAM'. Kishore Nallan suggested using `max_candidates` and `exhaustive_search` parameters. However, `max_candidates` did not work as expected, and needs further investigation.
Querying and Performance in Typesense
Chris had a problem with a Typesense query not returning a match. Jason solved the issue by suggesting the `exhaustive_search` feature. Further performance and features of Typesense were also discussed.
Understanding 'max_candidates' and 'num_typos' Parameters in Typesense
Narayan asked about the difference between the 'max_candidates' and 'num_typos' parameters for typo tolerance in Typesense. Jason referred them to the documentation. Kishore Nallan clarified further and answered Narayan's follow-up questions, and also addressed Akash's query about case sensitivity in Typesense.