# community-help
e
@Kishore Nallan when you have some spare time, could you elaborate a bit on how `drop_tokens_threshold` and `typo_tokens_threshold` work? I've read the docs about it but I'm not sure I understand it fully.
k
Let's say your query is "alpa beta gamma". There are 3 words/tokens in this query. Each of these tokens could contain a typo (in this case, "alpa" is wrong). When you set `typo_tokens_threshold: X`, you are telling Typesense to keep generating alternative tokens, each within an edit distance of `num_typos` from a token in the query, until at least X results are found. You want to stop at some point, because you can keep modifying the query tokens to generate a lot of alternative tokens.
Similarly, there might be no documents that contain all the tokens in the query. In that case, Typesense tries to drop tokens from the query, e.g. searching only for "beta gamma", to find relevant documents. When you set `drop_tokens_threshold: X`, you are telling Typesense to continue dropping tokens from the query until X results are found.
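For reference, both thresholds are ordinary search parameters that sit next to `num_typos`. Here is a minimal sketch of such a parameter set in Python. The field name `title` and all the values are made up for illustration, so adjust them to your own schema.

```python
# Search parameters for the example query from this thread.
# With the Python client, a dict like this is typically passed to
# client.collections["<collection>"].documents.search(...).
search_parameters = {
    "q": "alpa beta gamma",
    "query_by": "title",          # hypothetical field name
    "num_typos": 2,               # maximum typos allowed per token
    "typo_tokens_threshold": 5,   # keep trying typo variants until 5 docs are found
    "drop_tokens_threshold": 5,   # keep dropping tokens until 5 docs are found
}

print(search_parameters)
```

Note that both thresholds count documents, not tokens or typos.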
e
ok, so the X in both cases is how many results/documents it needs to find at a minimum?
k
Yes, it's a threshold on the number of docs: until that many are found, Typesense keeps either looking for tokens with more typos or dropping more tokens from the original query.
e
the X is not about how many tokens to drop, or how many typos to allow ...
oh ok
k
Correct, X is the number of docs.
Maybe it should have been named `drop_tokens_num_docs` or something.
e
I understand
so how does `num_typos` play together with these previous settings?
k
`num_typos` is the maximum number of typos (0, 1, 2) allowed.
First, tokens with typo = 0 are used to fetch results. If not enough results are found, Typesense looks for results which contain tokens with typo = 1, and then does the same for typo = 2. If at any point the threshold is reached, it stops.
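The escalation from typo = 0 upwards can be sketched as a toy model in Python. This is not Typesense's actual implementation: the in-memory `docs` list, the whole-word matching, and the function names are assumptions made only to show how `num_typos` and the threshold interact.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def search_with_typos(query, docs, num_typos, typo_tokens_threshold):
    """Toy model: allow more typos per token until enough docs are found.

    Returns the typo level that was reached and the docs matched at that level.
    """
    tokens = query.split()
    found = []
    for allowed in range(num_typos + 1):
        # A doc matches when every query token is within `allowed` edits
        # of at least one word in the doc.
        found = [
            doc for doc in docs
            if all(
                any(edit_distance(tok, word) <= allowed for word in doc.split())
                for tok in tokens
            )
        ]
        if len(found) >= typo_tokens_threshold:
            break  # threshold reached, stop widening the search
    return allowed, found


docs = ["alpha beta gamma", "alpha beta delta", "beta gamma"]
print(search_with_typos("alpa beta gamma", docs, num_typos=2, typo_tokens_threshold=1))
```

With typo = 0 nothing matches "alpa", so the sketch moves on to typo = 1, finds one document, and stops there because the threshold of 1 is met.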
e
ok understood. thank you for explaining.
is it doing the drop tokens first, or typos first? how does that work?
k
Typos first and then dropping tokens.
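To make the order concrete, here is a rough Python sketch of the two phases. It is a toy model under stated assumptions, not how Typesense is implemented: `count_hits` and the `FAKE_HITS` table are hypothetical stand-ins for a real index, the hit counts are invented, and dropping the leading token first is chosen only to mirror the "beta gamma" example above.

```python
def search_order(tokens, num_typos, typo_tokens_threshold,
                 drop_tokens_threshold, count_hits):
    """Toy model: try typo levels first, then start dropping tokens.

    Returns the (tokens, typos, hits) attempts in the order they were tried.
    """
    attempts = []
    hits = 0
    typos = 0

    # Phase 1: keep every token, allow more typos until enough docs are found.
    for typos in range(num_typos + 1):
        hits = count_hits(tokens, typos)
        attempts.append((tuple(tokens), typos, hits))
        if hits >= typo_tokens_threshold:
            break

    # Phase 2: still too few docs, so drop tokens one at a time.
    remaining = list(tokens)
    while hits < drop_tokens_threshold and len(remaining) > 1:
        remaining = remaining[1:]  # assumption: drop the leading token first
        hits = count_hits(remaining, typos)
        attempts.append((tuple(remaining), typos, hits))

    return attempts


# Invented hit counts standing in for a real index lookup.
FAKE_HITS = {
    (("alpa", "beta", "gamma"), 0): 0,
    (("alpa", "beta", "gamma"), 1): 1,
    (("beta", "gamma"), 1): 4,
}


def count_hits(tokens, typos):
    """Hypothetical lookup: number of docs matching these tokens at this typo level."""
    return FAKE_HITS.get((tuple(tokens), typos), 0)


for attempt in search_order(["alpa", "beta", "gamma"], num_typos=2,
                            typo_tokens_threshold=1, drop_tokens_threshold=3,
                            count_hits=count_hits):
    print(attempt)
```

In this run the full query is tried at typo = 0 and typo = 1 first. Only after that does the sketch drop "alpa" and search for "beta gamma", because 1 document is still below the `drop_tokens_threshold` of 3.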