# community-help
s
Good morning folks, I have a question about how to fine-tune Typesense queries to get better results for plural words. We index product names, for example "_JBL Charge 5 Portable Bluetooth Speaker_". When searching for "Speaker", results are accurate, but when searching for "speakerS" (with an S) we get pretty bad results. I expected this to be handled as a typo, but it seems that something is off.
When searching for "Speakers", we first get some documents that have an exact match on "Speakers", but then none of the "Speaker" documents are listed.
By increasing `typo_tokens_threshold` from 5 to 9, we got a bit more results, but still not all of them.
Speaker => 97 results
Speakers => 31 results
*Not expecting to be equal, but the difference is pretty big.
So far we have implemented a one-way synonym, speaker -> speakers, to improve the results.
Is there a better approach to this?
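For reference, a minimal sketch of what such a one-way synonym could look like as an API request. It assumes the per-collection synonyms endpoint (`PUT /collections/{collection}/synonyms/{id}` with `root` and `synonyms` fields) available in Typesense versions from around the time of this thread; newer versions may organize synonyms differently, so check the docs for your version. The collection name `products` and the synonym id `speakers-plural` are hypothetical. The direction is also an assumption: here a search for the root "speakers" is expanded to also match "speaker".

```python
import json


def one_way_synonym_request(collection, synonym_id, root, synonyms):
    """Build (method, path, body) for upserting a one-way synonym.

    Nothing is sent over the network; this only assembles the request.
    """
    path = f"/collections/{collection}/synonyms/{synonym_id}"
    # With a root, the synonym is one-way: searching for the root
    # also matches documents containing any of the listed synonyms.
    body = json.dumps({"root": root, "synonyms": list(synonyms)})
    return "PUT", path, body


# Hypothetical collection name and synonym id.
method, path, body = one_way_synonym_request(
    "products", "speakers-plural", "speakers", ["speaker"]
)
print(method, path)
print(body)
```

The request would be sent with the `X-TYPESENSE-API-KEY` header to the Typesense host; one such rule is needed per plural form, which is why this approach fits a small set of short terms better than a whole catalog.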
k
Have you tried increasing `typo_tokens_threshold` further? I think we might need something similar to `max_candidates` (used for prefixes) for typo candidates.
If you can share a subset of the dataset that reproduces this issue, I can dig into it further and see what we can do to address this.
s
Results for speakerS improved after moving `typo_tokens_threshold` from 9 to 12. But now I have a trickier one. Small data set:
TV Sony 43" Class X85J
LG TV 48" Class C2 Series OLED evo 4K UHD
Samsung TV 43" Class Q60B QLED 4K
Samsung TV 65" Class The Frame QN65LS03BAFXZA
When executing a search with "TV", everything works as expected. Now, when searching for "TVs", there are no matches.
{
  "q": "TVs",
  "query_by": "title",
  "page": 1,
  "per_page": 10,
  "exhaustive_search": true,
  "min_len_2typo": 8,
  "num_typos": 2,
  "typo_tokens_threshold": 9
}
k
Look up min_len_1typo and min_len_2typo params
For shorter strings we don't enable typos because it becomes very noisy sometimes. But behaviour is configurable.
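To make the length gating concrete, here is a simplified model of how `min_len_1typo` and `min_len_2typo` could limit typo tolerance per query token. It is a sketch, not Typesense's actual implementation: the function `allowed_typos` is hypothetical, and the defaults of 4 and 7 are the documented defaults as I understand them.

```python
def allowed_typos(token, num_typos=2, min_len_1typo=4, min_len_2typo=7):
    """Simplified model: max typos tolerated for a token, based on its length."""
    n = len(token)
    if n >= min_len_2typo:
        # Long enough for 2-typo correction, capped by num_typos.
        return min(num_typos, 2)
    if n >= min_len_1typo:
        # Long enough for 1-typo correction only.
        return min(num_typos, 1)
    # Too short: typo correction is skipped to avoid noisy matches.
    return 0


# The request above sets min_len_2typo=8 but leaves min_len_1typo unset,
# so "TVs" (3 characters) is below the assumed default of 4.
print(allowed_typos("TVs", min_len_2typo=8))
# Lowering min_len_1typo to 2 lets "TVs" tolerate one typo, enough to reach "TV".
print(allowed_typos("TVs", min_len_1typo=2, min_len_2typo=8))
print(allowed_typos("speakers", min_len_1typo=2, min_len_2typo=8))
```

Under this model, "TVs" gets no typo tolerance in the original request, which would explain the empty result set.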
s
I see
Makes sense. So longer strings like "speaker" are solved by `typo_tokens_threshold`, and for these short values I will maybe stick with synonyms.
k
Correct
s
typo_tokens_threshold: 16,
min_len_1typo: 2,
min_len_2typo: 8,
Is giving great results for our catalog 🎉
(A screenshot of an internal comparison tool that we built to compare against our previous search engine and against Typesense query configurations)
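Putting the thread together, a minimal sketch of the earlier "TVs" request merged with the tuned values and encoded as a query string for the documents search endpoint (`GET /collections/{collection}/documents/search`). The collection name `products` is hypothetical, and the values are the ones that worked for this particular catalog, not general recommendations.

```python
from urllib.parse import urlencode

# Search parameters from the earlier "TVs" request in this thread.
base_params = {
    "q": "TVs",
    "query_by": "title",
    "page": 1,
    "per_page": 10,
    "exhaustive_search": "true",
    "num_typos": 2,
}

# Tuned values reported at the end of the thread.
tuned = {
    "typo_tokens_threshold": 16,
    "min_len_1typo": 2,
    "min_len_2typo": 8,
}

params = {**base_params, **tuned}
query_string = urlencode(params)

# "products" is a hypothetical collection name.
print("/collections/products/documents/search?" + query_string)
```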