Understanding Typo Tolerance in Search Queries
TLDR gab sought clarity on typo tolerance settings in search operations, specifically on why fewer documents than expected were returned when typos were allowed. Kishore Nallan explained how the "num_typos" and "typo_tokens_threshold" search parameters dictate typo allowance during searches.
Mar 18, 2022 (20 months ago)
It seems I misunderstand typo tolerance settings.
I have 2 documents with:
doc 1 name: "Linder"
doc 2 name: "Lindenhof"
Here is the search query I'm doing:
"limit_hits": 6, "num_typos": 1, "per_page": 6, "q": "Linder", "query_by": "name", "typo_tokens_threshold": 1
Only one document is returned: document 1. Document 2 is not returned.
If one typo is allowed with "num_typos", why am I not getting both documents returned?
Kishore Nallan 09:59 AM
`"typo_tokens_threshold": 1` -> this means that you want Typesense to continue searching with more and more typos until at least 1 document is found. In this case, since `Linder` is a zero-typo prefix match, that requirement is satisfied and typo relaxation is not done. If you increased `typo_tokens_threshold` to 2, the other result will show up.
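To make the explanation above concrete, here is a toy model in Python of the relaxation loop as described in this thread. It is only a sketch and not Typesense's actual implementation: `prefix_typos` and `toy_search` are hypothetical helpers written for illustration, and the matching is a plain prefix edit distance on a single field.

```python
def prefix_typos(query: str, word: str) -> int:
    """Smallest edit distance between `query` and any prefix of `word`."""
    q, w = query.lower(), word.lower()
    # prev[j] = edit distance between the query consumed so far and w[:j]
    prev = list(range(len(w) + 1))
    for i, qc in enumerate(q, start=1):
        cur = [i]
        for j, wc in enumerate(w, start=1):
            cur.append(min(prev[j] + 1,               # drop a query character
                           cur[j - 1] + 1,            # insert a word character
                           prev[j - 1] + (qc != wc))) # match or substitute
        prev = cur
    # The best value in the last row corresponds to the best matching prefix.
    return min(prev)


def toy_search(names, q, num_typos, typo_tokens_threshold):
    """Allow more typos only while fewer than `typo_tokens_threshold` hits exist."""
    hits = []
    for allowed in range(num_typos + 1):
        hits = [n for n in names if prefix_typos(q, n) <= allowed]
        if len(hits) >= typo_tokens_threshold:
            break  # enough documents found, so no further typo relaxation
    return hits


names = ["Linder", "Lindenhof"]
print(toy_search(names, "Linder", num_typos=1, typo_tokens_threshold=1))
# ['Linder']
print(toy_search(names, "Linder", num_typos=1, typo_tokens_threshold=2))
# ['Linder', 'Lindenhof']
```

In this model, "Lindenhof" is one typo away from "Linder" as a prefix, so it only appears once the zero-typo pass fails to reach the threshold.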
My issue is with
Kishore Nallan 12:01 PM
Kishore Nallan 12:03 PM
But now, how can I say that strictly one typo is allowed, so that both one-typo matches and zero-typo matches are returned?
Kishore Nallan 12:12 PM
Kishore Nallan 12:13 PM
num_typos: 1, typo_tokens_threshold: 10
typo_tokens_threshold would try to allow more typos until 10 documents are reached.
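For reference, the settings above could be sent to the search endpoint roughly like this. This is a sketch under assumptions: the host `http://localhost:8108` and the collection name `places` are hypothetical placeholders, and the helper `build_search_url` is mine, not part of any Typesense client.

```python
from urllib.parse import urlencode


def build_search_url(host: str, collection: str, params: dict) -> str:
    """Build a GET URL for the documents search endpoint of one collection."""
    return f"{host}/collections/{collection}/documents/search?{urlencode(params)}"


params = {
    "q": "Linder",
    "query_by": "name",
    "num_typos": 1,               # never allow more than one typo
    "typo_tokens_threshold": 10,  # keep relaxing typos until 10 documents are found
    "per_page": 6,
    "limit_hits": 6,
}

# Hypothetical host and collection name; the request itself would also need
# the API key in the X-TYPESENSE-API-KEY header.
url = build_search_url("http://localhost:8108", "places", params)
print(url)
```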
So just to be sure: in the case above, if I have only one document with 0 or 1 typo and 100 other documents with 5 typos, will I get only one result?
Kishore Nallan 12:19 PM
Understanding Typesense's `drop_tokens_threshold` and `typo_tokens_threshold`
em1nos sought clarification on Typesense's `drop_tokens_threshold` and `typo_tokens_threshold`. Kishore Nallan defined them, emphasizing that they depend on the number of documents found, not tokens or typos; `num_typos` configures the typo allowance.
Understanding Typesense Query Fuzziness and Thresholds
Ashraful was confused about different query results when applying filters in Typesense. Jason clarified the function of `drop_tokens_threshold` and `typo_tokens_threshold` options, explaining their effect on search results and their precedence.
Typesense Search Solution Issues
Rolando faced incorrect search results using Typesense. Kishore Nallan suggested changing typo parameters and upgrading Typesense version. However, undesired results persisted and need further investigation.
Issue with Typo Correction/Prefix Search and the Role of max_candidates
John noticed inconsistent search results based on max_candidates settings, and Kishore Nallan clarified its role for multi-word queries. They resolved that increasing max_candidates ensures the query isn't prematurely limited.
Addressing `num_typos` Inconsistency in Document Search
John had an issue with `num_typos` inconsistency when using prefix search. Kishore Nallan clarified the technical aspects, adjusted the aggressiveness of the feature and resolved the issue. They also discussed a limit on `num_typos` value.