TLDR Carlo asked about implementing stemming, lemmatization, and stopwords with Typesense. Kishore Nallan suggested the Porter stemmer and mentioned that stopword support is under development. Gustavo suggested using GPT-3.5-Turbo.
The Porter stemmer is the most widely used stemming algorithm. You have to stem the values during indexing and also stem the queries before sending them to Typesense. However, I suspect that most people don't use stemmers because prefix searching & typo correction are usually enough to handle plurals etc.
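A rough sketch of the "stem on both sides" idea, in case it helps. The key point is that the same function is applied to the indexed values and to the query. `naive_stem` is a toy stand-in that only strips a couple of suffixes, not the Porter algorithm; in practice you would pass in a real stemmer such as NLTK's `PorterStemmer().stem`. The `title_stemmed` field name and the helper names are made up for illustration.

```python
import re

def naive_stem(word: str) -> str:
    """Toy stand-in for a real stemmer (NOT the Porter algorithm)."""
    for suffix in ("ing", "s"):
        # Only strip when at least 3 characters remain, to spare short words.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_text(text: str, stem=naive_stem) -> str:
    """Lowercase, tokenize on alphanumerics, and stem every token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(stem(t) for t in tokens)

def prepare_document(doc: dict, field: str = "title") -> dict:
    """Return a copy of doc with a hypothetical '<field>_stemmed' field added."""
    return {**doc, f"{field}_stemmed": stem_text(doc[field])}

def prepare_query(q: str) -> str:
    """Stem the user's query with the same function used at index time."""
    return stem_text(q)

doc = prepare_document({"id": "1", "title": "Running shoes"})
query = prepare_query("running shoe")
```

You would then index `doc` as usual and search with `q` set to `query` and the stemmed field listed in `query_by`, keeping the original `title` for display.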
Stopword support is under development.
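Until then, one possible workaround is to drop stopwords from the query on the client before it reaches Typesense. A minimal sketch, assuming a small hand-picked English list; `STOPWORDS` and `strip_stopwords` are illustrative names, not part of any Typesense API.

```python
import re

# Hypothetical, deliberately tiny stopword list; replace with your own.
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "in", "on", "for", "to", "is"})

def strip_stopwords(text: str, stopwords=STOPWORDS) -> str:
    """Remove stopwords from a query string before sending it to the search engine."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kept = [t for t in tokens if t not in stopwords]
    # If every token was a stopword, keep the original tokens so the query is not empty.
    return " ".join(kept or tokens)

cleaned = strip_stopwords("The history of the internet")
```

The fallback matters for queries made up entirely of stopwords, which would otherwise turn into an empty search.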
thnx!
You can also use GPT-3.5-Turbo to do that as well as add synonyms, labels, categories, etc.
Carlo
Thu, 13 Jul 2023 06:10:14 UTC
I've seen a ticket saying that stemming, lemmatization, and stopwords aren't currently supported by Typesense. Has anyone successfully implemented them before the data reaches Typesense, or does anyone know a good workaround?