#community-help

Querying Normalized Typesense Index Results in Unexpected Matches

TLDR: Jonathan asked whether a search for "jill" should return both "jill" and "jıll" (spelled with the Turkish dotless ı), even with num_typos=0. Jason confirmed this is expected behavior but suggested opening a GitHub issue describing the specific use case, which Jonathan did.

Solved
Jan 19, 2023
Jonathan
07:24 PM
Is it expected for queries, even with ?num_typos=0, to return both jill and jıll when searching for jill, if your indexed data contains the Turkish "dotless i" (U+0131, https://www.compart.com/en/unicode/U+0131)?

[
  { "name": "jill" },
  { "name": "jıll" }
]
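For reference, the kind of search request described above could be built like this. It is only a sketch: the host `localhost:8108`, the collection name `users`, and the helper `build_search_url` are assumptions for illustration, not details from the thread. The `name` field comes from the sample documents.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, collection: str, query: str) -> str:
    """Build a Typesense search URL with typo tolerance turned off."""
    params = {
        "q": query,
        "query_by": "name",  # field taken from the sample documents
        "num_typos": 0,      # disable typo tolerance, as in the question
    }
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


# Host and collection name are placeholders.
url = build_search_url("http://localhost:8108", "users", "jill")
print(url)
```

An actual request would also need the `X-TYPESENSE-API-KEY` header.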
Jason
07:26 PM
Typesense normalizes those internally so jıll gets indexed as jill… So it’s expected behavior, at least at the moment.
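A rough illustration of why the two names collide: if an indexer folds characters to a base form before storing tokens, both spellings end up as the same token, so num_typos has nothing to do with the match. The folding table below is hypothetical and covers only the dotless i; it is not Typesense's actual normalization code.

```python
# Hypothetical folding table: only the character discussed in this thread.
FOLD = str.maketrans({"\u0131": "i"})  # U+0131 LATIN SMALL LETTER DOTLESS I


def index_token(text: str) -> str:
    """Return the token a folding indexer might store for `text`."""
    return text.lower().translate(FOLD)


names = ["jill", "j\u0131ll"]
tokens = {name: index_token(name) for name in names}

# Both names map to the same stored token, so an exact (zero-typo)
# lookup for "jill" matches both documents.
print(tokens)
```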
Jason
07:26 PM
Does that not work for your use-case?
Jonathan
07:27 PM
The use case is username search for making a payment, so a fuzzy match would obviously be bad.
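One possible application-side guard for a case like this is to compare the raw stored value against the query after the search returns and keep only exact matches. A minimal sketch, assuming hits shaped like a Typesense search response (`hits[].document`); the helper name `exact_hits` is made up for illustration, and this is not something suggested in the thread.

```python
import unicodedata


def exact_hits(hits, field, query):
    """Keep only hits whose stored field equals the query exactly.

    NFC normalization makes composed and decomposed forms compare equal,
    but it does not fold the dotless i (U+0131) into a plain "i".
    """
    wanted = unicodedata.normalize("NFC", query)
    return [
        hit for hit in hits
        if unicodedata.normalize("NFC", hit["document"][field]) == wanted
    ]


# Sample hits mirroring the two documents from the question.
sample_hits = [
    {"document": {"name": "jill"}},
    {"document": {"name": "j\u0131ll"}},
]
filtered = exact_hits(sample_hits, "name", "jill")
print(filtered)
```

The filter runs on whatever the search returns, so it narrows results but cannot recover matches the search engine did not return.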
Jonathan
07:28 PM
This issue is somewhat similar: https://github.com/typesense/typesense/issues/262#issuecomment-844631342, but that one was solved via num_typos.
Jason
07:29 PM
Hmmm, username search for payments is a good use case. Could you open a GitHub issue reporting this, along with your use case?
Jason
07:29 PM
CC: Kishore Nallan
Jonathan
07:29 PM
Yep, and thanks for the quick response (as usual).
Jason
07:57 PM
Perfect, thank you!