Debugging Searches and Understanding Search Term Matching

TLDR Helder sought help in understanding mismatched search terms. Kishore Nallan explained prefix search behavior and suggested creating a synonym for "fuji" and "fujifilm."

May 22, 2023 (6 months ago)
Helder
12:28 PM
Hi, we are debugging some searches.
Does anyone know the best way to quickly understand why a search term didn't match certain documents?
I know that experience with the parameters is the best guide, but we are still learning all of them.
Kishore Nallan
12:30 PM
One reason: when no document contains all the tokens in the query, Typesense drops words from the query, right to left and then left to right, until the number of documents fetched satisfies the drop_tokens_threshold parameter.

In such a scenario, documents that do not contain all the search terms will appear in the results.
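
To make the token-dropping behavior described above concrete, here is a rough toy model in Python. It is not Typesense's actual implementation: the function name `search_with_token_dropping`, the whitespace tokenization, and the sample documents are all assumptions made for illustration. Only the drop order (right to left, then left to right) and the `drop_tokens_threshold` name come from the explanation above.

```python
def search_with_token_dropping(query, docs, drop_tokens_threshold=1):
    """Toy model (hypothetical): require every token, then drop tokens
    until at least `drop_tokens_threshold` documents have been found."""
    tokens = query.lower().split()

    def matching(required):
        # A document matches when it contains every required token exactly.
        return [d for d in docs if all(t in d.lower().split() for t in required)]

    # Token sets to try, in order: the full query, then dropping words
    # from the right, then dropping words from the left.
    candidates = [tokens]
    candidates += [tokens[:i] for i in range(len(tokens) - 1, 0, -1)]
    candidates += [tokens[i:] for i in range(1, len(tokens))]

    results = []
    for required in candidates:
        for d in matching(required):
            if d not in results:
                results.append(d)
        if len(results) >= drop_tokens_threshold:
            break  # enough documents fetched, stop dropping tokens
    return results


docs = ["red running shoes", "blue running jacket", "red wool hat"]
print(search_with_token_dropping("red running jacket", docs))
```

In this sketch no document contains all three words, so "jacket" is dropped and "red running shoes" comes back even though it does not contain every search term. Passing `drop_tokens_threshold=0` stops the loop after the full query, so nothing is dropped.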
Helder
12:32 PM
It was just a bad example on my part.

I have matches, but I still don't understand what we need to tweak to solve this:

Search term: “fuji máquina”

Against: máquina Fujifilm
Kishore Nallan
12:37 PM
Ah, Typesense does prefix searches only on the last word of the query. So “máquina fuji” will match, but not the other way around. You can create a synonym here that maps fuji to fujifilm.

The reason we do prefix searching only on the last word is that otherwise the search results can become quite noisy.
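
The last-word prefix rule and the synonym workaround can be sketched with a small toy matcher. This is only an illustration under simplifying assumptions, not how Typesense is implemented: the `matches` function and the `synonyms` dict are hypothetical stand-ins, and typo tolerance and token dropping are ignored entirely.

```python
def matches(query, doc, synonyms=None):
    """Toy matcher (hypothetical): exact match for every query word except
    the last one, which may match as a prefix of a document word."""
    synonyms = synonyms or {}
    doc_tokens = doc.lower().split()
    q_tokens = query.lower().split()

    for i, tok in enumerate(q_tokens):
        # A query word may be satisfied by itself or by any of its synonyms.
        alternatives = [tok] + synonyms.get(tok, [])
        is_last = i == len(q_tokens) - 1
        if is_last:
            ok = any(d.startswith(a) for d in doc_tokens for a in alternatives)
        else:
            ok = any(d == a for d in doc_tokens for a in alternatives)
        if not ok:
            return False
    return True


doc = "máquina Fujifilm"
print(matches("máquina fuji", doc))                          # True
print(matches("fuji máquina", doc))                          # False
print(matches("fuji máquina", doc, {"fuji": ["fujifilm"]}))  # True
```

With "fuji" as the last word it is treated as a prefix of "fujifilm". With "fuji" first it must match a whole word, which only works once the synonym is in place.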