Discussing Typesense's Tokenization Feature
TLDR Roshan wants to understand Typesense's tokenization feature. Kishore Nallan explains that it tokenizes on spaces and suggests using a special character as a separator.
Feb 18, 2022 (21 months ago)
Kishore Nallan 11:53 AM
Kishore Nallan 11:54 AM
text, then I want `Hello world` to be one and `everyone` to be another token
Kishore Nallan 12:10 PM
Kishore Nallan 12:29 PM
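To illustrate the idea in the summary (splitting on spaces by default versus splitting on a chosen special character), here is a minimal Python sketch. It is a toy model only and does not reproduce Typesense's tokenizer; the function name `tokenize` and the `|` separator are assumptions made for illustration.

```python
from typing import List, Optional


def tokenize(text: str, separator: Optional[str] = None) -> List[str]:
    """Toy tokenizer (hypothetical, not Typesense's implementation).

    With no separator, split on whitespace. With a separator, split only
    on that character so multi-word phrases stay together as one token.
    """
    if separator is None:
        return text.split()
    # Drop empty pieces and trim stray whitespace around each token.
    return [part.strip() for part in text.split(separator) if part.strip()]


# Splitting on spaces breaks the phrase apart.
print(tokenize("Hello world everyone"))
# Splitting on an assumed special character keeps "Hello world" whole.
print(tokenize("Hello world|everyone", separator="|"))
```

With a space as the only boundary, `Hello world` cannot stay together; choosing a character that never appears inside a phrase is what makes the two-token split possible in this sketch.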
Resolving Typesense Search Issues
Maximilian started a conversation about Typesense search behavior. Kishore Nallan and Mike discussed the issue and suggested a workaround, and Kishore Nallan promised an official solution soon. No final confirmation of resolution was provided.
Restricting `token_separators` to a Specific Field in Typesense
Loic asked Jason about applying `token_separators` to a specific field in Typesense. Jason suggested opening a GitHub issue to request this feature.
Tokenization and Indexing Fields with Typesense
kam wanted to understand how to control tokenization and indexing for certain fields. Jason explained that tokenization is applied during search queries and not during the indexing phase, and shared how to delete a document using an indexed unique value under `id`.
Handling Two-Word Queries with Custom Separators
Dima proposes adding a parameter to the API for handling two-word queries. Jason suggests opening a GitHub issue for the feature request.
Issues with Repeated Words and Hyphen Queries in Typesense API
JinW discusses issues with repeated word queries and hyphen-containing queries in Typesense. Kishore Nallan offers possible solutions. During the discussion, Mr seeks advice on `token_separators` and how to send custom headers. Issues remain with repeated word queries.