Hi All, just wondering if anyone knows whether `_t...
# community-help
p
Hi All, just wondering if anyone knows whether `_text_match` should be affected by the number of times a token is found within a field? Currently I'm seeing the same score for all hits, regardless of how many times the token appears within the field. For example, when searching for a single word, every result that contains that token at least once is returned with the same score, even though one result has the token 10 times in the body field and another has only 1 occurrence.
k
We don't take the number of repeating tokens into account for match_score. We used to in earlier versions, but it caused so many false positives due to "keyword stuffing" that we decided not to use it anymore.
p
Hmm, but on sites where there is no user-generated content, it doesn't make sense that an article that refers to something 100 times would rank equal to an article that only references it once. It means having to add a keyword field to articles rather than relying on the natural content of an article. Is there really no way to make this work? There isn't any tie-breaking I can apply, so is Typesense not really suitable for long-form content?
j
For long-form content, you want to break it out into multiple documents, say by paragraphs or lines, to increase the granularity of search results, which in turn improves relevance.
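
A minimal sketch of the splitting step suggested above, in plain Python. It assumes paragraphs are separated by blank lines; the function name `split_article` and the field names (`article_id`, `paragraph_index`, `body`) are hypothetical and not part of any Typesense schema. It only builds the per-paragraph documents and does not call the Typesense client.

```python
import re


def split_article(article_id, title, body):
    """Split one long article into one document per non-empty paragraph.

    Assumption: paragraphs are separated by one or more blank lines.
    All field names below are illustrative, not a required schema.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
    return [
        {
            # Unique per-paragraph id derived from the parent article id.
            "id": f"{article_id}-{i}",
            # Kept so hits can be traced back to the parent article.
            "article_id": article_id,
            "title": title,
            "paragraph_index": i,
            "body": paragraph,
        }
        for i, paragraph in enumerate(paragraphs)
    ]


docs = split_article(
    "a1",
    "Example",
    "First paragraph.\n\nSecond paragraph.\n\n\nThird.",
)
for doc in docs:
    print(doc["id"], doc["body"])
```

With documents shaped like this, an article that mentions a term in many paragraphs produces many matching hits instead of one, and hits can be collapsed back to articles by `article_id`, either in application code or, if it fits your setup, with Typesense's `group_by` search parameter (worth checking against the docs for your version).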