I’m having issue with exact match and mostly for r...
# community-help
j
I’m having an issue with exact match, mostly for repeated words, e.g. https://songs-search.typesense.org/?songs_1630520530850%5Bquery%5D=boom%20boom “Boom Boom” should be first, IMO. What’s the best way to deal with this?
k
Have you tried setting prioritize_exact_match=true?
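For reference, a minimal sketch of a search URL with this parameter set explicitly. The host, collection name, and query_by field are assumptions for illustration, and a real request would also need the X-TYPESENSE-API-KEY header:

```python
from urllib.parse import urlencode

# Hypothetical host and collection name, for illustration only.
BASE_URL = "http://localhost:8108"
COLLECTION = "songs"


def build_search_url(query: str, query_by: str = "title") -> str:
    """Return a Typesense search URL with exact-match prioritization enabled."""
    params = {
        "q": query,
        "query_by": query_by,  # assumed field name
        "prioritize_exact_match": "true",
    }
    return f"{BASE_URL}/collections/{COLLECTION}/documents/search?{urlencode(params)}"


print(build_search_url("boom boom"))
```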
j
I thought prioritize_exact_match is true by default, based on the docs.
k
For some demos we don't set it to true, as it makes the results very monotonous for simple queries like "pizza" if your data set has multiple titles with just that word. I am not sure what configuration songs uses. It might very well be an issue with repeated tokens, but wanted to mention that possibility.
j
Yeah, it is set to true. It seems to ignore the repeated words.
k
Got it, thanks for confirming. Can you please create a quick issue on GitHub for this? I am actually working on some changes in this area, so I can fix it as part of that for the next release!
👍 1
j
Since I've got your attention here: how do I deal with a search query that contains a hyphen? “One Two-Three” should return “One Two Three” and vice versa. I could fix this by removing the hyphen on my end, but that's not exactly ideal.
k
The latest 0.22 RC builds have a configuration to specify custom characters as separators, which in this case would be the hyphen.
j
Thank you
Check token_separators
🙌 3
m
Where do I have to add token_separators? When I create the collection, or somewhere else?
k
Yes during collection creation.
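A minimal sketch of what that could look like in the collection schema. The collection and field names are assumptions for illustration; token_separators needs 0.22 or a recent 0.22 RC build:

```python
import json

# Hypothetical collection and field names, for illustration only.
schema = {
    "name": "songs",
    "fields": [{"name": "title", "type": "string"}],
    # Treat the hyphen as a token separator, so "Two-Three" is indexed
    # as the two tokens "Two" and "Three".
    "token_separators": ["-"],
}

# This JSON body would be sent with POST /collections when creating the collection.
body = json.dumps(schema)
print(body)
```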
m
@Kishore Nallan when I add token_separators, I can't find it in the response type (
k
What version of Typesense are you using? It's available only in recent 0.22 RC builds.
m
I use version 0.21. Does that mean I have to update to 0.22?
k
Yes, correct. 0.22 is not out yet, but we publish pre-release builds. These are pretty stable now and can be used.
🙌 1
m
OK, thanks a lot! 🙌
Is there a Docker image available for v0.22.0? How can I test it?
k
Yes, it is available.
Check the 0.22.0.rcs22 Docker image.
🙌 1
m
OK, thanks.
@Kishore Nallan I have one more question 🙃. Is it possible to send headers with a query? I need to send a {Bearer Token} in the headers.
k
I don't follow you. Custom headers to Typesense?
m
@Kishore Nallan yes )
k
And what should Typesense do with that?
m
@Kishore Nallan
Sorry for the confusion here. The question sounded a bit silly without the context.
We built a proxy backend (only for read operations), and this proxy behaves as if it were the Typesense API, so that we can use the Algolia InstantSearch UI library with no customization.
The only problem is that we want to pass a bearer token in a special header, so that our backend proxy can process it and, if the request is allowed, forward it to Typesense.

We couldn't find a way to send a custom header through the InstantSearch API. Maybe it's better to direct this question to the Algolia community, but we wanted to check with you guys first. Maybe you've already faced this one before.
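For context, a rough sketch of the proxy-side check described above. It assumes the token arrives in a standard Authorization: Bearer header and that allowed tokens are kept in a set; both are hypothetical choices, and how the header gets sent from InstantSearch is still the open question:

```python
from typing import Collection, Mapping, Optional


def extract_bearer_token(headers: Mapping[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header, or None."""
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    scheme, _, token = normalized.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def should_forward(headers: Mapping[str, str], allowed_tokens: Collection[str]) -> bool:
    """Decide whether the proxy should forward the request to Typesense."""
    token = extract_bearer_token(headers)
    return token is not None and token in allowed_tokens
```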
j
@Kishore Nallan I also confirmed that the repeated-tokens issue is still present in 0.22.0.rcs22.
@Jason Bosco @Kishore Nallan Do you know if anyone is looking into this one? Thank you.
j
@JinW Looks like some of the fixes in v0.22 have addressed this issue:
j
Oh, that’s weird. I can’t seem to get it to work on my end. Let me confirm it again. Is that in the latest released build or an RC build?
j
That's in version 0.22, which was released publicly yesterday.
j
Yeah, it didn’t really fix it. Try merry merry or pasta pasta: “Merry Merry Christmas” or “Pasta Pasta” should be first in the results. The boom boom results are still not in the correct order.
j
Oh well... Will take a closer look in the coming weeks
❤️ 1
k
Can you please confirm that you're setting ?prioritize_exact_match=true?
j
@Kishore Nallan Yes, it is set to true.
k
OK, got it. Do you have a small dataset on which this is trivially reproducible? Maybe a test set with 4-5 documents.
j
Here is an example dataset I created. The query is mong mong, and “Mong Mong” never shows up first.
k
Thanks I will check
❤️ 1
j
I think it should at least show up in this order, since tokens are dropped from right to left:
{"id":"9", "title": "Mong Mong"}
{"id":"26", "title": "Mong Mong Racoon"}
{"id":"4", "title": "Mong Mong SATELLITES JAPAN"}
{"id":"18", "title": "Mong Mong Zi Zung"}
{"id":"7", "title": "Mong Mong Tapo & Raya"}
Update: We temporarily solved this issue by storing the string “Mong Mong” without the space and changing the query for repeated words to “MongMong”. (Not ideal, but it works for now.)
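A minimal sketch of that workaround, assuming the same normalization is applied to both the stored title and the incoming query. The function name is hypothetical, and this is only one way to do it:

```python
def collapse_repeated_words(text: str) -> str:
    """Join runs of the same consecutive word into a single token."""
    out = []
    prev = None
    for word in text.split():
        if prev is not None and word.lower() == prev.lower():
            # Same word as the previous one: glue it onto the last token.
            out[-1] += word
        else:
            out.append(word)
        prev = word
    return " ".join(out)


print(collapse_repeated_words("Mong Mong"))
print(collapse_repeated_words("Mong Mong Racoon"))
```

Titles without repeated words pass through unchanged, so the function can be run over every document and every query.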