Understanding Typo Tolerance in Search Queries
TLDR gab sought clarity on typo tolerance settings in search operations, specifically on why fewer documents than expected were returned when one typo was allowed. Kishore Nallan explained how the "num_typos" and "typo_tokens_threshold" search parameters control when typos are allowed during a search.
Mar 18, 2022 (20 months ago)
gab
09:54 AM
It seems I misunderstand the typo tolerance settings.
I have 2 documents with:
doc 1 name: "Linder"
doc 2 name: "Lindenhof"
Here is the search query I'm doing:
"limit_hits": 6,
"num_typos": 1,
"per_page": 6,
"q": "Linder",
"query_by": "name",
"typo_tokens_threshold": 1
Only one document is returned: document 1. Document 2 is not returned.
If one typo is allowed with "num_typos", why am I not getting both documents returned?
Thanks
Kishore Nallan
09:59 AM
"typo_tokens_threshold": 1 -> this means that you want Typesense to continue searching with more and more typos until at least 1 document is found. In this case, since Lindenhof is a zero-typo prefix match, that requirement is satisfied and typo relaxation is not done. If you increase typo_tokens_threshold to 2, the other result will show up.
gab
10:01 AM
My issue is with the "q": "Linder" param.
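The relaxation behaviour described for "typo_tokens_threshold" can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Typesense's implementation: the helpers `edit_distance`, `prefix_typos` and `toy_search` are hypothetical names, typos are counted as plain Levenshtein edits against a prefix of the stored name, and ranking is ignored.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def prefix_typos(query: str, word: str) -> int:
    """Fewest edits needed to turn `query` into some prefix of `word`."""
    q, w = query.lower(), word.lower()
    return min(edit_distance(q, w[:k]) for k in range(len(w) + 1))


def toy_search(query, names, num_typos, typo_tokens_threshold):
    """Allow one more typo per round, stopping once enough documents are found."""
    hits = []
    for typos in range(num_typos + 1):
        hits += [n for n in names if prefix_typos(query, n) == typos]
        if len(hits) >= typo_tokens_threshold:
            break  # enough results: no further typo relaxation
    return hits


names = ["Linder", "Lindenhof"]
print(toy_search("Linder", names, num_typos=1, typo_tokens_threshold=1))  # ['Linder']
print(toy_search("Linder", names, num_typos=1, typo_tokens_threshold=2))  # ['Linder', 'Lindenhof']
```

In this model the exact match alone satisfies a threshold of 1, so the one-typo round never runs; with a threshold of 2 the search relaxes to one typo and the second document appears.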
gab
11:59 AM
Kishore Nallan
12:01 PM
Kishore Nallan
12:03 PM
gab
12:07 PM
But now, how can I say that strictly one typo is allowed, so that both the one-typo and the no-typo matches are returned?
Kishore Nallan
12:12 PM
gab
12:13 PM
Kishore Nallan
12:13 PM
gab
12:17 PM
num_typos: 1
typo_tokens_threshold: 10
typo_tokens_threshold would keep allowing more typos until 10 results are reached.
So just to be sure: in the case above, if I have only one document with 0 or 1 typos and 100 other documents with 5 typos, will I get only one result?
Kishore Nallan
12:19 PM
gab
12:22 PM