Right, so if a language uses Unicode characters and puts spaces between words, it will work out of the box with Typesense.
For CJK, we’ve had to add a special tokenizer to handle those languages.
So other languages that don’t have spaces between words would also need special tokenizers and won’t work out of the box.
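
To make the difference concrete, here is a rough sketch of what goes wrong. This is not Typesense's actual tokenizer; `whitespace_tokenize` is a hypothetical helper that only splits on spaces, and the sample strings are made up for illustration.

```python
def whitespace_tokenize(text: str) -> list[str]:
    """Naive tokenizer (illustrative only): lowercase, then split on whitespace."""
    return text.lower().split()


english = "Fast open source search"
chinese = "快速开源搜索"  # roughly the same phrase, written without spaces

english_tokens = whitespace_tokenize(english)
chinese_tokens = whitespace_tokenize(chinese)

print(english_tokens)  # ['fast', 'open', 'source', 'search']
print(chinese_tokens)  # ['快速开源搜索']
```

The English sentence splits into four searchable words, but the Chinese one comes back as a single token, so a query for just 搜索 ("search") would have nothing to match against. That is the gap a language-aware tokenizer has to fill.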