# community-help
Hi team! I have a strange feature request, maybe you can help me with a workaround 🤔 In my text dataset, I have some terms that are very similar to English words but carry an additional meaning. They are usually odd names of products, services, or companies, similar to how `C++` and `C#` become the same token as `C`, and `.NET` becomes the same token as `NET`. Because the tokenizer legitimately strips punctuation marks from the text, users have a hard time finding exact matches for such search queries: they have to learn about quotes, realize this is what's happening, and use them only around the term. I could enable `symbols_to_index` and add `.`, `+`, `#`, and `!` to it (see the sketch below), but that would probably worsen the overall quality of search results (e.g. if an author of a text missed a space somewhere and a word got stuck to a punctuation mark). I have a list of such terms, so can I instruct the tokenizer to keep them as they are? Or build a workaround that disables typo tolerance and punctuation stripping for specific words in the search query?
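In case it helps to see what I mean, here's a minimal sketch of the `symbols_to_index` route I'd rather avoid (assuming the Python client; the collection name and fields are made up):

```python
import typesense

client = typesense.Client({
    "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
    "api_key": "xyz",  # placeholder key
    "connection_timeout_seconds": 2,
})

# Hypothetical schema: indexing ., +, #, ! keeps "C++", "C#" and ".NET"
# as distinct tokens instead of collapsing them to "C" and "NET" --
# but it applies globally, to every document and every occurrence.
client.collections.create({
    "name": "articles",  # made-up collection name
    "fields": [{"name": "text", "type": "string"}],
    "symbols_to_index": [".", "+", "#", "!"],
})
```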