# community-help
ó
Hey there! I have a field that holds "references/codes" for the items in my collection, such as:
• "2011_PL24-0015"
• "AS-19/2024 - AJT/65177/2022"
Those should be treated as whole words, including spaces and symbols. I'm performing the search across many fields at the same time, including this one and a "description" one. I can't find these items by reference, since Typesense `q` won't treat them as single words (as expected). Is there a way to work around this for only one field of the `query_by`? Or any other solution?
k
By default `_` and `-` are ignored, so `2011_PL24-0015` is actually treated as `2011PL240015`. With a space, it's not possible to ignore it as a word separator.
ó
And doing something through filters? Or another symbol?
k
You can replace the space with some other character; then everything will be treated as a single word.
ó
Is there a way not to ignore symbols?
k
Yes, check the `symbols_to_index` parameter in the schema.
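A minimal sketch of what such a schema could look like, written as a Python dict that serializes to the JSON body for collection creation. The collection name `items`, the two field names, and the chosen symbols are assumptions for illustration; only `symbols_to_index` itself comes from the thread.

```python
import json

# Hypothetical collection and field names, for illustration only.
schema = {
    "name": "items",
    "fields": [
        {"name": "reference", "type": "string"},
        {"name": "description", "type": "string"},
    ],
    # Collection-level setting: these symbols are indexed instead of dropped.
    # The list of symbols here is an assumption based on the example references.
    "symbols_to_index": ["_", "-", "/"],
}

# JSON body that would be sent when creating the collection.
print(json.dumps(schema, indent=2))
```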
ó
Thanks! That was it, I'm blind.
`symbols_to_index` is missing from the link it points to: https://typesense.org/docs/guide/tips-for-searching-common-types-of-data.html. But I guess the format is the same as `token_separators`, an array of strings.
Can I use it for only one field?
k
Yes same format. And it's a global parameter, not possible to apply it per field.
ó
Damn, I only need it for one field 😅 My use case is, I only want to apply it to the "reference" but not to the "description", as the latter can also contain those symbols but they should be treated as spaces rather than as characters.
Weird use cases, sorry 🤣
k
Haha I know, it's annoying. One of those many things we want to fix.
ó
It's fine! Do you think that a left join would work without a lot of performance impact?
k
For small collections it won't be a problem. Say up to a few hundred thousand docs.
ó
😅 4M+ and growing. So I guess joining a 4M+ with another 4M+ won't be a good idea
Do you know if you'll target it for v28 or v29?
k
The thing is it might still work, but on a case-by-case basis. With joins we are splitting the docs to make them smaller, so it might help to reduce IO.
Target what for v28?
ó
Moving both fields from schema to the particular field
It's just for knowing if we can wait or we should look for other workarounds
k
You mean the symbols configuration?
ó
and the token_separator
k
I'll have to check the complexity. I'll get back to you.
ó
No hurry, if it's not even on your roadmap, we'll look for workarounds
I just understood it was, as you said it was one of those many things you wanted to fix
k
The workaround would be to do a multi-search with a modified version of the query and then merge the results.
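A rough sketch of that idea, under stated assumptions: the collection name `items`, the field names, and the choice of "modification" (spaces replaced with `_`) are all hypothetical. It only builds a `multi_search` request body and merges hits from a response-shaped dict, so it does not talk to a server.

```python
def build_searches(query: str) -> dict:
    """Build a multi_search body: the original query plus a modified one."""
    # Assumed modification: join the reference into one token with underscores.
    modified = query.replace(" ", "_")
    return {
        "searches": [
            {"collection": "items", "q": query, "query_by": "description"},
            {"collection": "items", "q": modified, "query_by": "reference"},
        ]
    }


def merge_hits(response: dict) -> list:
    """Concatenate hits from all result sets, keeping the first hit per document id."""
    seen = set()
    merged = []
    for result in response.get("results", []):
        for hit in result.get("hits", []):
            doc_id = hit["document"]["id"]
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(hit)
    return merged
```

Note that this merge just concatenates and de-duplicates; it does not produce a single unified ranking across the two searches.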
ó
Naah, that would impact the ranking and sorting of stuff
It's not worth it to go down that rabbit hole
I think I will escape the characters with URL encoding in a hidden field and live with it xD
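A minimal sketch of that hidden-field approach, assuming a hypothetical extra field called `reference_key` that is filled at indexing time. The same encoding would have to be applied to the query before searching that field; how the encoded value is then tokenized depends on the collection's settings and is not covered here.

```python
from urllib.parse import quote


def encode_reference(reference: str) -> str:
    """URL-encode a reference so spaces and slashes become %XX escapes."""
    # safe="" makes quote() escape "/" as well; "_" and "-" are left as-is.
    return quote(reference, safe="")


def add_reference_key(doc: dict) -> dict:
    """Return a copy of doc with the encoded reference in a hypothetical hidden field."""
    return {**doc, "reference_key": encode_reference(doc["reference"])}


doc = add_reference_key({"reference": "AS-19/2024 - AJT/65177/2022"})
print(doc["reference_key"])  # AS-19%2F2024%20-%20AJT%2F65177%2F2022
```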
k
Yeah