Typesense's Support for Korean Language Segmentation
TLDR Soul asked whether Typesense supports full segmentation for the Korean language, like Elasticsearch's Nori plugin. Kishore Nallan clarified that Typesense doesn't use Nori.
Aug 15, 2023 (1 month ago)
Soul
02:16 AM
Kishore Nallan
02:56 AM
Soul
11:03 AM
https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori.html
Nori tokenizes Korean words and extracts nouns for a more accurate search index.
Does Typesense tokenize Korean text to make search more accurate?
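To make the question concrete, here is a minimal sketch of what plain whitespace tokenization does with a Korean sentence. It is an illustration only, not Typesense's or Nori's actual implementation; the sample sentence and the `whitespace_tokens` helper are hypothetical.

```python
def whitespace_tokens(text: str) -> list[str]:
    """Split on whitespace only; no morphological analysis is applied."""
    return text.split()


# Hypothetical sample sentence: noun "한국어" (Korean) + object particle "를",
# followed by the verb form "검색합니다" (search).
sentence = "한국어를 검색합니다"
tokens = whitespace_tokens(sentence)

print(tokens)
# The bare noun never appears as its own token, because the particle
# stays attached to it when only whitespace is used as a boundary.
print("한국어" in tokens)
```

A morphological analyzer such as Nori would separate the noun from its particle, which is why an exact-token match on the bare noun can behave differently between the two approaches.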
Aug 17, 2023 (1 month ago)
Kishore Nallan
12:03 PM
Similar Threads
Troubleshooting Typo Tolerance Issue with Typesense for Korean
Minyong informed Kishore Nallan about a typo tolerance issue in Typesense with Korean text. Kishore Nallan suggested adjusting the byte difference limit for Korean, but warned this could slow down the search function. Minyong approved testing the solution.
Discussing Typesense's Tokenization Feature
Roshan asked how Typesense's tokenization feature works. Kishore Nallan explained that it tokenizes on spaces and suggested using a special character as a separator.
Seeking Help for Locale Schema Option
David asked about the locale schema option and its documentation. Kishore Nallan explained that it is somewhat undocumented but provided an example for Korean. David then described their e-commerce store use case, and Kishore Nallan suggested separate collections. Minyong also received directions regarding Korean support from Kishore Nallan.
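As a rough sketch of the locale option mentioned in the last thread, a collection schema can mark a string field with `locale` set to `ko`. The collection name `products_ko`, the field name `title`, and the `build_schema` helper are assumptions made for illustration; the exact example given in that thread is not reproduced here.

```python
import json


def build_schema(collection_name: str, field_name: str) -> dict:
    """Build a collection schema with one Korean-locale string field."""
    return {
        "name": collection_name,
        "fields": [
            # "locale": "ko" asks Typesense to treat this field as Korean text.
            {"name": field_name, "type": "string", "locale": "ko"},
        ],
    }


# Hypothetical names; a per-language collection follows the
# "separate collections" suggestion from the thread summary.
schema = build_schema("products_ko", "title")
print(json.dumps(schema, indent=2))
```

With the official Python client, a dict like this would typically be passed to `client.collections.create(schema)` against a running server.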

