# community-help
r
@Kishore Nallan @Kishore S @Akash Joshi @Ankur Golwa Does Typesense have a tokenization feature? I want to split the search query into tokens (for grouping of text, etc.)
If it doesn't have this feature natively, how can we use tokenization while querying and indexing?
k
Can you give me an example of tokenization you wish to do?
Typesense tokenizes on spaces, and if you define custom separators, it considers those as well.
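A minimal sketch of what that behavior looks like. The `token_separators` collection setting is Typesense's way to declare extra characters to split on; the collection name, fields, and the `simple_tokenize` helper below are illustrative, not Typesense's actual implementation:

```python
# Illustrative collection schema: "token_separators" tells Typesense to split
# tokens on these symbols in addition to spaces.
schema = {
    "name": "articles",  # hypothetical collection name
    "fields": [{"name": "title", "type": "string"}],
    "token_separators": ["-", "/"],
}

def simple_tokenize(text, separators):
    """Rough approximation of split-on-space-plus-custom-separators."""
    for sep in separators:
        text = text.replace(sep, " ")
    return text.split()

print(simple_tokenize("state-of-the-art search", schema["token_separators"]))
# -> ['state', 'of', 'the', 'art', 'search']
```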
r
ok, like if I want to tokenize the text "Hello world everyone", then I want
Hello world
to be one token and
everyone
to be another token @Kishore Nallan
k
We don't have a way to customize that behavior. You can always tokenize before indexing by combining words with a symbol, like "hello_world", and do the same with the query as well.
r
ok, suppose I have my own tokenizer. How can I pass that tokenized list of words to Typesense while indexing, and how will it work on search? @Kishore Nallan
k
You have to use a special character as a separator rather than a space, then add that character to the symbols_to_index configuration.
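Putting the two pieces together, the collection schema could look something like this. `symbols_to_index` is the Typesense setting that keeps a symbol as part of a token instead of stripping it; the collection name and field are illustrative:

```python
# Illustrative schema: indexing "_" as a symbol means "hello_world" stays a
# single token instead of being split into "hello" and "world".
schema = {
    "name": "articles",  # hypothetical collection name
    "fields": [{"name": "title", "type": "string"}],
    "symbols_to_index": ["_"],
}

# Both the indexed document and the search query must use the same separator:
document = {"title": "hello_world everyone"}
query = "hello_world"
```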