#community-help

Achieving Stemming Support with Typesense

TLDR Sabyasachi asked how to implement stemming in Typesense, which Kishore Nallan explained must be handled externally. Sabyasachi later shared they created an extra field for storing stemmed content.

Jan 08, 2022 (24 months ago)
Sabyasachi
06:08 AM
How are you achieving stemming support? I think Typesense does not directly support stemming, so what are you folks using as a workaround (without irrelevant results showing up)?
Kishore Nallan
01:52 PM
You will have to handle stemming outside Typesense, by stemming both the text and the query before sending them to Typesense.

Jan 10, 2022 (24 months ago)
Sabyasachi
03:58 AM
Thanks.
Jan 11, 2022 (24 months ago)
Sabyasachi
04:05 AM
For anyone’s future reference:

Here is what I did: I added a separate field to the schema, text_stemmed. I used nltk.PorterStemmer to stem the content and stored the resulting string in text_stemmed.
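
A minimal sketch of the indexing side described above, not the exact code used in the thread. The collection name `articles`, the source field name `text`, the regex tokenizer, and the helper names `make_porter_stemmer`, `stem_text`, and `with_stemmed_field` are all assumptions made for illustration; only `text_stemmed` and the Porter stemmer come from the message.

```python
import re
from typing import Callable


def make_porter_stemmer() -> Callable[[str], str]:
    """Return NLTK's Porter stemmer as a plain word -> stem function."""
    # Imported here so the helpers below can be used with any other stemmer
    # without NLTK installed.
    from nltk.stem import PorterStemmer
    return PorterStemmer().stem


def stem_text(text: str, stem_word: Callable[[str], str]) -> str:
    """Lowercase the text, split it into word tokens, stem each, rejoin with spaces."""
    tokens = re.findall(r"\w+", text.lower())  # assumed tokenizer, not from the thread
    return " ".join(stem_word(tok) for tok in tokens)


# Hypothetical schema: `text` holds the original content,
# `text_stemmed` holds its stemmed copy.
SCHEMA = {
    "name": "articles",
    "fields": [
        {"name": "text", "type": "string"},
        {"name": "text_stemmed", "type": "string"},
    ],
}


def with_stemmed_field(doc: dict, stem_word: Callable[[str], str]) -> dict:
    """Return a copy of the document with `text_stemmed` filled in from `text`."""
    return {**doc, "text_stemmed": stem_text(doc["text"], stem_word)}
```

With the official Python client, the schema would be passed to `client.collections.create(SCHEMA)`, and each document returned by `with_stemmed_field(doc, make_porter_stemmer())` to `client.collections["articles"].documents.create(...)`.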

When querying, I use the same method to generate a stemmed query string, then concatenate the original query and the stemmed query. In the search params, I put text_stemmed last in the query_by param, so exact matches are still prioritized higher.
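
The query side could look roughly like this. It is a sketch that follows the description literally; the original field name `text`, the regex tokenizer, and the helper name `build_search_params` are assumptions, and `stem_word` stands for whatever word-level stemmer was used at indexing time.

```python
import re
from typing import Callable


def build_search_params(query: str, stem_word: Callable[[str], str]) -> dict:
    """Build search parameters that combine the original and the stemmed query."""
    # Stem the query the same way the indexed content was stemmed.
    tokens = re.findall(r"\w+", query.lower())  # assumed tokenizer
    stemmed_query = " ".join(stem_word(tok) for tok in tokens)
    return {
        # Original query first, stemmed query appended after it.
        "q": f"{query} {stemmed_query}",
        # `text_stemmed` goes last, so matches on the original `text`
        # field are ranked ahead of matches on the stemmed field.
        "query_by": "text,text_stemmed",
    }
```

With NLTK installed, `PorterStemmer().stem` from `nltk.stem` can be passed as `stem_word`, and the resulting dict can be handed to `client.collections["articles"].documents.search(...)` in the official Python client (the collection name is again hypothetical).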
