I searched for SOLR and got a hit because of solve...
# community-help
n
I searched for SOLR and got a hit because of "solve". (I am searching documents full of IT and tech-related terms and abbreviations.) Well, this is not what I expected ))) It is probably a side effect of stemming (I set
"stem": true
for this field in my documents). I actually have a long list of (almost all?) tech-related terms and abbreviations, things like SOLR, C++ and so on. Can I somehow tell Typesense "do not stem these words"?
k
It's not possible to exclude words from being stemmed.
n
Well... you could take a look at how Apache Lucene implements this. To exclude specific words from stemming in Lucene, you can use the KeywordMarkerFilter. This filter marks specific terms as keywords, and marked terms are then skipped by the stemmer during analysis. https://lucene.apache.org/core/9_11_1/analysis/common/org/apache/lucene/analysis/miscellaneous/KeywordMarkerFilter.html
Is there any workaround for this? What if I create two nearly identical text fields in Typesense, one stemmed and one not stemmed, and give the non-stemmed field higher priority in search... would that help?
Or searching for "SOLR" in quotes to force an exact match? (That does not seem to work at the moment.)
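For reference, the two-field idea could be sketched roughly as below. This is only an illustration of the question, not a confirmed fix: the collection name `docs` and the field names `body` and `body_stemmed` are hypothetical, and the thread does not establish that weighting the non-stemmed field actually keeps "solve" out of the results (the stemmed field is still searched).

```python
# Hypothetical schema: the same text stored twice, once stemmed, once not.
schema = {
    "name": "docs",  # hypothetical collection name
    "fields": [
        {"name": "body", "type": "string"},                        # not stemmed
        {"name": "body_stemmed", "type": "string", "stem": True},  # stemmed copy
    ],
}

# Search both fields, giving the non-stemmed field the higher weight.
search_params = {
    "q": "SOLR",
    "query_by": "body,body_stemmed",
    "query_by_weights": "2,1",  # assumption: 2:1 is an arbitrary starting ratio
}
```

Each document would need to carry its text in both fields, which roughly doubles the indexed text for that field.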
k
I can't think of a workaround apart from managing the stemming of queries and documents on your end before sending them to Typesense. Please create a GitHub issue for a feature request. We prioritize feature requests based on community feedback.
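A minimal sketch of what managing stemming on your end might look like, assuming a protected-word list and a field indexed without `"stem": true` so that Typesense does not stem the text a second time. The `PROTECTED` set, `toy_stem`, and `normalize` are all hypothetical names; `toy_stem` is a deliberately crude suffix stripper standing in for a real stemmer (for example a Snowball implementation), not the algorithm Typesense uses.

```python
# Hypothetical list of terms that must never be stemmed.
PROTECTED = {"solr", "c++"}

def toy_stem(word: str) -> str:
    """Crude suffix stripper; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "e", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def normalize(text: str) -> str:
    """Lowercase and stem every token except the protected ones."""
    out = []
    for raw in text.lower().split():
        token = raw.strip(".,;:!?()\"'")  # drop surrounding punctuation
        if not token:
            continue
        out.append(token if token in PROTECTED else toy_stem(token))
    return " ".join(out)

# Apply the same function to document text before indexing
# and to the user's query before searching.
print(normalize("Solving SOLR queries in C++"))
print(normalize("solve"))
```

Because the same `normalize` runs on both documents and queries, a query for SOLR stays `solr` while "solve" becomes `solv`, so the two no longer collapse to a shared stem.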