# community-help
r
Hello everyone, I'm confused about the drop-token setting `drop_tokens_threshold`. This is the description from the docs:

_If `drop_tokens_threshold` is set to a number `N` and a search query contains multiple words (eg: `wordA wordB`), if at least `N` results with both `wordA` and `wordB` in the same document are not found, then Typesense will drop `wordB` and search for documents with just `wordA`. Typesense will keep dropping keywords like this left to right and/or right to left, until at least `N` documents are found. Words that have the least individual results are dropped first. Set `drop_tokens_threshold` to `0` to disable dropping of words (tokens)._ Default: `1`
So I am assuming that when searching for `Hello World Something`, if it doesn't find `N` records, it will drop tokens from the right. Thus `Something` will be dropped, and TS will then search for `Hello World`.

For example:
dataset: AUS SA BD AUS SA IND BD AUS SA IND
Search query: BD AUS SA IND
drop token threshold: 1000
mode: right_to_left

What result should I expect?

My other confusion: the doc says "Typesense will keep dropping keywords like this left to right and/or right to left, until at least `N` documents are found", but also "Words that have the least individual results are dropped first". I think these two lines contradict each other. Can you please explain this?
f
"Words that have the least individual results are dropped first": this has to do with when you use both left-to-right and right-to-left. So for your example:

Query: "Hello World Something"
N = 1000

If individual frequencies are:
Hello: 10000 docs
World: 5000 docs
Something: 2000 docs

Dropping sequence:
1. "Something" (least frequent)
2. "World" (next least frequent)
3. "Hello" (most frequent)
If it's only right-to-left or only left-to-right, it will just drop tokens in that order until the number of results reaches the threshold specified.
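
To make the two orderings concrete, here is a minimal Python sketch of the rule as described in this thread. It is a toy model, not Typesense's actual implementation: the function names `drop_order` and `queries_tried`, the `"both"` mode label, and the choice to stop once a single token remains are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def drop_order(tokens: List[str], mode: str,
               freqs: Optional[Dict[str, int]] = None) -> List[str]:
    """Order in which tokens would be dropped (toy model of the thread's explanation).

    `mode` labels are illustrative; "both" stands for using both directions,
    where the token with the fewest individual results is dropped first.
    """
    if mode == "right_to_left":
        return list(reversed(tokens))
    if mode == "left_to_right":
        return list(tokens)
    if mode == "both":
        freqs = freqs or {}
        # Least frequent token first; ties keep their original query order.
        return sorted(tokens, key=lambda t: freqs.get(t, 0))
    raise ValueError(f"unknown mode: {mode}")


def queries_tried(tokens: List[str], mode: str,
                  freqs: Optional[Dict[str, int]] = None) -> List[str]:
    """Successive queries attempted if no query ever reaches the threshold.

    Assumption: dropping stops once a single token remains.
    """
    remaining = list(tokens)
    tried = [" ".join(remaining)]
    for token in drop_order(tokens, mode, freqs):
        if len(remaining) == 1:
            break
        remaining.remove(token)
        tried.append(" ".join(remaining))
    return tried


freqs = {"Hello": 10000, "World": 5000, "Something": 2000}
print(drop_order(["Hello", "World", "Something"], "both", freqs))
# -> ['Something', 'World', 'Hello']
print(queries_tried(["BD", "AUS", "SA", "IND"], "right_to_left"))
# -> ['BD AUS SA IND', 'BD AUS SA', 'BD AUS', 'BD']
```

In the real engine each step stops as soon as at least `N` documents are found, so which of these queries actually supplies the results depends on the data.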