#community-help

Discussion on Typesense Raw and Altered Text Match Scores

TLDR Weilin requested that Typesense provide both raw and altered 'text match' scores for ranking. Jason asked for a GitHub issue and promised to address it next week. Kishore Nallan then clarified the current implementation already includes the raw score.

Solved
Aug 25, 2022 (13 months ago)
Weilin
07:53 PM
Hello! According to this, when we bucket our results, all the scores within a bucket are forced to the same text match score. Is there a way we could have a raw_text_match score that never gets altered and a bucketed_text_match score that does get altered? That way we can bucket but still keep the raw score intact for other use cases. E.g., if we have a perfect ranking and we just want to divide those results into buckets of 10, I don’t want that perfect ranking to be messed up and have to reorganize each bucket based on a user-defined attribute.
Aug 26, 2022 (13 months ago)
Kishore Nallan
06:02 AM
Currently we send the altered text match score. You want to have the raw unaltered text match score in the result as well?
Weilin
02:33 PM
Yes, that would be helpful. The altered text match score is the same for so many results that it doesn’t help us rank. It would definitely be more useful to use the altered text match score to bucket, and the raw text match score to rank within those buckets.
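The two-score idea in this request can be sketched roughly as follows. This is only an illustration, not how Typesense implements bucketing: the function name `bucket_and_rank`, the equal-sized split by rank position, and the `popularity` tie-breaker are all assumptions, and the field names `raw_text_match` and `bucketed_text_match` are the ones proposed in the message, not fields of the actual API.

```python
from typing import Dict, List


def bucket_and_rank(hits: List[Dict], num_buckets: int, tie_breaker: str) -> List[Dict]:
    """Hypothetical helper: split hits into equal-sized buckets by raw score,
    then reorder each bucket by a user-defined attribute (descending)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    if not hits:
        return []

    # Rank by the unaltered score first.
    ranked = sorted(hits, key=lambda h: h["raw_text_match"], reverse=True)

    # Ceiling division so that at most num_buckets buckets are produced.
    bucket_size = -(-len(ranked) // num_buckets)

    result: List[Dict] = []
    for start in range(0, len(ranked), bucket_size):
        bucket = ranked[start:start + bucket_size]
        # Every hit in a bucket shares one altered score (here: the bucket's top score),
        # while raw_text_match is copied through untouched.
        shared_score = bucket[0]["raw_text_match"]
        bucket = [dict(h, bucketed_text_match=shared_score) for h in bucket]
        bucket.sort(key=lambda h: h[tie_breaker], reverse=True)
        result.extend(bucket)
    return result


# Made-up sample data for illustration only.
sample_hits = [
    {"id": "a", "raw_text_match": 90, "popularity": 1},
    {"id": "b", "raw_text_match": 80, "popularity": 5},
    {"id": "c", "raw_text_match": 70, "popularity": 2},
    {"id": "d", "raw_text_match": 60, "popularity": 9},
]
for hit in bucket_and_rank(sample_hits, num_buckets=2, tie_breaker="popularity"):
    print(hit["id"], hit["bucketed_text_match"], hit["raw_text_match"])
```

Because each hit keeps its own raw_text_match, a client could still re-sort inside a bucket by the raw score instead of the tie-breaker.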
Jason
03:56 PM
Could you open a GitHub issue for this request so we can track it?
Aug 31, 2022 (12 months ago)
Weilin
09:19 PM
Apologies for the late response, here is the issue: https://github.com/typesense/typesense/issues/708
Sep 01, 2022 (12 months ago)
Weilin
03:06 PM
Jason, do you think we could get a rough estimate of when this new field could be added, for our planning purposes? This would unlock a ton of value for us, and we are hoping it would be a fast and easy change.
Jason
04:13 PM
Should be fairly quick. We should be able to get to this next week…
Sep 02, 2022 (12 months ago)
Weilin
12:37 AM
amazing thanks! will look forward to it
Sep 05, 2022 (12 months ago)
Kishore Nallan
12:19 PM
Weilin, I actually just took a look at the implementation, and we restore the original text match score after sorting the results on the bucketed score: https://github.com/typesense/typesense/blob/483ed4d533e8a657784472e3a3066298fa085c44/src/collection.cpp#L1165

So the score you see will already be the raw score. However, there is a bug in certain JSON implementations that truncates the value of the text match score, because it's a large 64-bit value. So in the 0.24 RC builds, we also send the text match score as a string value in the text_match_info object in the response for each hit.
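The truncation described here can be reproduced with a small sketch. The response fragment below is made up for illustration: the score value is arbitrary, and the `score` key inside text_match_info is an assumption about the response shape. Python's own parser keeps large integers exact, so `parse_int=float` is used to imitate a JSON implementation that stores every number as a 64-bit float.

```python
import json

# Hypothetical response fragment; the value and the "score" key are assumptions.
raw_body = (
    '{"text_match": 578730123365711993, '
    '"text_match_info": {"score": "578730123365711993"}}'
)

# Imitate a parser that represents every JSON number as a 64-bit float.
lossy_hit = json.loads(raw_body, parse_int=float)

# The numeric field has lost its low-order digits after the float round trip.
truncated = int(lossy_hit["text_match"])

# The string field is unaffected by number parsing and converts back exactly.
exact = int(lossy_hit["text_match_info"]["score"])

print("numeric field:", truncated)
print("string field: ", exact)
print("values agree: ", truncated == exact)
```

Integers above 2^53 cannot all be represented as 64-bit floats, which is why reading the string form is the safer choice in clients whose JSON parser has this limitation.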
Sep 07, 2022 (12 months ago)
Weilin
02:12 PM
I see, thanks Kishore! I’ll take a look to see if it suits our needs and will follow up here if we do see the bug.