Hey there I have a field that hold references codes for the typesense #community-help

Óscar Vicente

11/29/2024, 9:52 AM

Hey there! I have a field that holds "references/codes" for the items in my collection, such as:
• "2011_PL24-0015"
• "AS-19/2024 - AJT/65177/2022"
Those should be treated as whole words, including spaces and symbols. I'm searching across many fields at the same time, including this one and a "description" one. I can't find these items by reference, since Typesense won't treat them as single words (as expected). Is there a way to work around this for only one field of the query_by? Or any other solution?

Kishore Nallan

11/29/2024, 9:56 AM

By default "_" and "-" are ignored, so 2011_PL24-0015 is actually treated as 2011PL240015. A space, however, cannot be ignored as a word separator.

Óscar Vicente

11/29/2024, 9:58 AM

And doing something through filters? Or another symbol?

Kishore Nallan

11/29/2024, 9:58 AM

You can replace the space with some other character; then everything will be treated as a single word.
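A minimal sketch of that replacement, assuming it runs on the reference before indexing and on any reference-style query before searching. The helper name `to_single_token` and the choice of "_" as the filler are illustrative assumptions, not part of Typesense.

```python
def to_single_token(reference: str, filler: str = "_") -> str:
    """Collapse every run of whitespace into one filler character.

    The filler "_" is an assumption; any character that is not a
    word separator in the collection would serve the same purpose.
    """
    return filler.join(reference.split())


# The reference with spaces becomes one unbroken string.
print(to_single_token("AS-19/2024 - AJT/65177/2022"))
# A reference without spaces is left unchanged.
print(to_single_token("2011_PL24-0015"))
```

The same transformation has to be applied on both the indexing side and the query side, otherwise the stored value and the search text will not line up.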

Óscar Vicente

11/29/2024, 9:59 AM

Is there a way not to ignore symbols?

Kishore Nallan

11/29/2024, 9:59 AM

Yes, check the symbols_to_index parameter in the schema.
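For illustration, a collection schema with symbols_to_index set might look roughly like the sketch below. The collection name "items", the two field definitions, and the chosen symbols are assumptions made for this example; only the symbols_to_index key itself comes from the thread.

```python
import json

# Hypothetical schema: "items", "reference" and "description" are
# placeholder names. symbols_to_index sits at the top level of the
# schema and lists the symbols that should be indexed, not dropped.
schema = {
    "name": "items",
    "fields": [
        {"name": "reference", "type": "string"},
        {"name": "description", "type": "string"},
    ],
    "symbols_to_index": ["_", "-", "/"],
}

# This JSON body is what would be sent when creating the collection.
print(json.dumps(schema, indent=2))
```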

Óscar Vicente

11/29/2024, 10:01 AM

Thanks! That was it, I'm blind.

Óscar Vicente

11/29/2024, 10:08 AM

symbols_to_index is missing from the link it points to: https://typesense.org/docs/guide/tips-for-searching-common-types-of-data.html. But I guess the format is the same as token_separators, an array of strings.

Óscar Vicente

11/29/2024, 10:09 AM

Can I use it for only one field?

Kishore Nallan

11/29/2024, 10:09 AM

Yes, same format. And it's a global parameter; it's not possible to apply it per field.

Óscar Vicente

11/29/2024, 10:11 AM

Damn, I only need it for one field 😅 My use case is that I only want to apply it to the "reference" field but not to the "description" one, as the latter can also contain those symbols, but there they should be treated as spaces rather than as characters.

Óscar Vicente

11/29/2024, 10:11 AM

Weird use cases, sorry 🤣

Kishore Nallan

11/29/2024, 10:12 AM

Haha I know, it's annoying. One of those many things we want to fix.

Óscar Vicente

11/29/2024, 10:13 AM

It's fine! Do you think that a left join would work without a lot of performance impact?

Kishore Nallan

11/29/2024, 10:20 AM

For small collections it won't be a problem, say up to a few hundred thousand docs.

Óscar Vicente

11/29/2024, 10:21 AM

😅 4M+ and growing. So I guess joining a 4M+ with another 4M+ won't be a good idea

Óscar Vicente

11/29/2024, 10:22 AM

Do you know if you'll target it for v28 or v29?

Kishore Nallan

11/29/2024, 10:22 AM

The thing is, it might still work, but on a case-by-case basis. With joins we are splitting the docs to make them smaller, so it might help to reduce IO.

Kishore Nallan

11/29/2024, 10:22 AM

Target what for v28?

Óscar Vicente

11/29/2024, 10:22 AM

Moving both settings from the schema to the particular field

Óscar Vicente

11/29/2024, 10:23 AM

It's just to know whether we can wait or should look for other workarounds

Kishore Nallan

11/29/2024, 10:24 AM

You mean the symbols configuration?

Óscar Vicente

11/29/2024, 10:24 AM

and the token_separators

Kishore Nallan

11/29/2024, 10:25 AM

I have to check the complexity. I'll get back to you.

Óscar Vicente

11/29/2024, 10:25 AM

No hurry, if it's not even on your roadmap, we'll look for workarounds

Óscar Vicente

11/29/2024, 10:25 AM

I just understood it was, as you said it was one of those many things you wanted to fix

Kishore Nallan

11/29/2024, 10:25 AM

The workaround would be to do a multi-search with a modified version of the query and then merge the results.
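A rough sketch of that idea, assuming two searches in one multi-search request: the original query against "description", and a modified query (spaces replaced) against a reference field. The names `build_searches`, `merge_hits`, "items", and "reference_token" are hypothetical; the request and response shapes follow the multi-search format with a "searches" list going in and a "results" list coming back.

```python
from typing import Any


def build_searches(query: str, collection: str = "items") -> dict[str, Any]:
    """Build a multi-search body with the original and a modified query."""
    modified = "_".join(query.split())  # assumed modification: no spaces
    return {
        "searches": [
            {"collection": collection, "q": query, "query_by": "description"},
            {"collection": collection, "q": modified, "query_by": "reference_token"},
        ]
    }


def merge_hits(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Concatenate hits from all results, keeping the first hit per document id."""
    seen: set[str] = set()
    merged: list[dict[str, Any]] = []
    for result in response.get("results", []):
        for hit in result.get("hits", []):
            doc_id = hit["document"]["id"]
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(hit)
    return merged
```

This naive merge keeps each search's own ordering and simply drops duplicates; it does not attempt to re-rank hits across the two searches.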

Óscar Vicente

11/29/2024, 10:26 AM

Naah, that would impact the ranking and sorting of stuff

Óscar Vicente

11/29/2024, 10:26 AM

It's not worth it to go down that rabbit hole

Óscar Vicente

11/29/2024, 10:27 AM

I think I will escape the characters with URL encoding in a hidden field and live with it xD
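One way this could look, as a sketch only: percent-encode the reference with Python's standard library and store the result in an extra field that is searched but not displayed. The helper `encode_reference` and the field name "reference_encoded" are assumptions for illustration.

```python
from urllib.parse import quote


def encode_reference(reference: str) -> str:
    """Percent-encode spaces and reserved symbols so no space remains."""
    # safe="" also encodes "/", which quote() leaves untouched by default.
    return quote(reference, safe="")


# Hypothetical document; "reference_encoded" is the hidden search field.
doc = {
    "id": "1",
    "reference": "AS-19/2024 - AJT/65177/2022",
    "description": "Sample item",
}
doc["reference_encoded"] = encode_reference(doc["reference"])
print(doc["reference_encoded"])
```

A user-typed reference would need to go through the same encoding before being sent as the query against that field, and how the "%" characters are tokenized still depends on the collection's symbol settings.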

Kishore Nallan

11/29/2024, 10:27 AM

Yeah
