Issues With `text_match` Scoring for Search Queries in Typesense
TLDR: Colin encountered issues with the `text_match` scoring on Typesense v0.23.1. Jason and Kishore Nallan identified a potential issue with numeric overflow in the text match score and applied an unverified patch. The final resolution is unclear.
Jul 18, 2022 (17 months ago)
Colin
05:50 PM `text_match` is calculated on search 🧵
Colin
05:51 PM `{ query_by: 'fieldA, fieldB', query_by_weights: '20,1' }` it looks like it's calculating the same `text_match` score for all the records returned here in the query.
Colin
05:51 PM
Colin
05:51 PM `text_match` score since we have more occurrences of the search term on `fieldA`. Is there a way to make frequency affect the `text_match` score?
Jason
05:56 PM
Colin
05:57 PM Typesense v0.23.1
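For reference, the weighted query Colin describes could be expressed as a search request along these lines. This is a minimal sketch, not taken from the thread: the host, the collection name `records`, and the helper `build_search_url` are hypothetical, and `fox` is the query term Kishore refers to later in the thread. Only `q`, `query_by` and `query_by_weights` are Typesense search parameters.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: adjust to your own cluster and collection.
BASE_URL = "http://localhost:8108"
COLLECTION = "records"


def build_search_url(q, query_by, weights):
    """Build a search URL where each query_by field gets one weight."""
    # Local sanity check (an assumption of this sketch): one weight per field.
    if len(query_by) != len(weights):
        raise ValueError("need exactly one weight per query_by field")
    params = {
        "q": q,
        "query_by": ",".join(query_by),
        "query_by_weights": ",".join(str(w) for w in weights),
    }
    # urlencode percent-encodes the commas inside each value.
    return f"{BASE_URL}/collections/{COLLECTION}/documents/search?{urlencode(params)}"


url = build_search_url("fox", ["fieldA", "fieldB"], [20, 1])
print(url)
```

The request would also need an API key header, which is left out here to keep the sketch focused on the weighting parameters.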
Jul 19, 2022 (17 months ago)
Kishore Nallan
03:09 AM Prior to 0.23 we were summing up weights across fields, but that led to various edge cases and noisy results.
Colin
03:45 PM `text_match` score.
Kishore Nallan
04:03 PM
Colin
04:12 PM
Colin
04:13 PM
Kishore Nallan
04:17 PM
Mark
05:04 PM here's another example that highlights the problem. the two results have the same text_match score despite "description" being designated to receive higher weight
Rebecca
05:21 PM
Jul 20, 2022 (17 months ago)
Kishore Nallan
09:36 AM
a) The very first example in the thread involving the `fox` query: Typesense does not count individual occurrences of the tokens, since that caused relevancy issues due to keyword stuffing in real-world data sets that can be noisy.
b) The second example involving the `javascript` query: since Typesense derives a text match score from the best matched field of a record, the scores are the same here. This is something that I agree is not always ideal, so we have to see how we can support more fine-grained scoring that considers additional fields that match.
c) The `java` query: I have to check what's happening here, as the category field is weighted lower so that record must appear ahead.
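Points a) and b) can be illustrated with a toy model. It is not Typesense's scoring code: the functions, documents and weights below are made up for illustration. It only aims to show why taking the score of the best matched field produces ties that summing across fields would not, and why repeated occurrences of a token change nothing.

```python
# Toy model only: NOT Typesense's actual scoring implementation.

def field_score(tokens, text):
    """Return 1 if every query token appears in the field, else 0.

    Repeated occurrences are deliberately ignored (point a)."""
    words = text.lower().split()
    return int(all(t in words for t in tokens))


def best_field_score(tokens, doc, weights):
    """Score a record by its single best matching field (point b)."""
    return max(field_score(tokens, doc.get(f, "")) * w for f, w in weights.items())


def summed_score(tokens, doc, weights):
    """Alternative: add up the weighted scores of all matching fields."""
    return sum(field_score(tokens, doc.get(f, "")) * w for f, w in weights.items())


# Hypothetical weights and records.
weights = {"description": 2, "title": 1}
doc_a = {"title": "Learn JavaScript",
         "description": "JavaScript errors and JavaScript debugging"}
doc_b = {"title": "Web basics", "description": "Learn JavaScript"}
tokens = ["javascript"]

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(name,
          "best field:", best_field_score(tokens, doc, weights),
          "summed:", summed_score(tokens, doc, weights))
```

In this toy model both records tie under best-field scoring because each matches in `description`, while the summed variant ranks `doc_a` higher since it also matches in `title`.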
Mark
01:17 PM
Jason
02:36 PM
Rebecca
02:39 PM
Jason
02:45 PM
Jul 21, 2022 (17 months ago)
Jason
01:56 PM
Colin
02:34 PM
Jason
02:35 PM
Colin
02:43 PM `query_by=title,shortDescription,description`
Colin
02:44 PM
Colin
02:45 PM
Kishore Nallan
02:49 PM
Colin
02:52 PM ### Why learn JavaScript Errors and Debugging? This course will guide you through the basics of debugging and handling JavaScript errors to build a growth mindset approach to programming and prevent a crash in your applications! ### Outcomes: Learn how to debug your code and learn to predict and handle errors in your web applications. ### Note on Prerequisites: Intermediate JavaScript is a prerequisite, and you should be comfortable with arrays, objects, and looping through arrays.
Jason
03:22 PM
Jason
03:23 PM
Colin
03:28 PM
Jason
03:33 PM `.../api/...` endpoint as you search, copy as curl the last request and DM it to me?
Jason
03:47 PM `"query_by":"description,longDescription,organizationId"` in the query
Jason
03:48 PM