Adjusting Text Match Score Calculation in Typesense
TLDR Johannes wanted to modify the text match score calculation in Typesense to improve the ranking of search results. With advice from Jason and Kishore Nallan, several solutions were proposed, including creating a GitHub issue, trying different search parameters, and updating to a newer Docker build to resolve the matter.
May 24, 2022 (19 months ago)
Johannes
11:45 AM
Jason
11:46 AM
Johannes
11:47 AM
Jason
11:51 AM https://typesense.org/docs/0.22.2/api/documents.html#ranking-parameters
Jason
11:51 AM
Johannes
11:59 AM value. I want to make sure that when I start typing, e.g., "a", only terms starting with "a" are shown first, no matter whether other terms contain more occurrences of "a" that are not at the beginning.
Jason
12:03 PM
Johannes
12:04 PM
Jason
12:27 PM
Johannes
12:41 PM
Johannes
12:42 PM
Jason
01:03 PM Could you try this on 0.23.0.rc69?
Johannes
01:18 PM
Jason
01:37 PM
May 25, 2022 (19 months ago)
Johannes
09:28 AM
Jason
12:58 PM
Johannes
03:27 PM
Jason
03:59 PM
Johannes
04:00 PM
Jason
04:03 PM
Johannes
04:04 PM
May 26, 2022 (19 months ago)
Johannes
09:42 AM
Jason
06:57 PM 0.24.0.rc1 (this should be available on Docker Hub) and let me know? We made some changes to relevance algorithms there, which I think will help with your use case. This build also has a change to highlighting, where single characters are highlighted instead of the whole word on a match.
May 30, 2022 (19 months ago)
Johannes
11:54 AM
Jason
07:10 PM query_by: 'first_word' and these are the results I see. Does this line up with what you're looking to do?
May 31, 2022 (19 months ago)
Johannes
09:04 AM
Jason
03:07 PM Johannes OK, here's another way to do this: index a new field called, say, "search_string", and remove all spaces when you create this field at indexing time.
So, for example, you would index "Anterior part of the inferior surface of cerebrum" as:
{
search_string: "Anteriorpartoftheinferiorsurfaceofcerebrum",
display_words: "Anterior part of the inferior surface of cerebrum"
}
And then set these search params:
{
...
query_by: "search_string",
highlight_full_fields: "display_words"
}
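A minimal Python sketch of the indexing step described above, assuming documents are plain dicts. The helper names `to_indexable` and `build_search_params` are hypothetical, and stripping spaces from the query as well is an assumption that the thread does not state.

```python
def to_indexable(display_words: str) -> dict:
    """Build a document with a space-free copy of the text for matching."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces removes spaces, tabs, and newlines.
    return {
        "search_string": "".join(display_words.split()),
        "display_words": display_words,
    }


def build_search_params(query: str) -> dict:
    """Search parameters from the message above, plus the query itself."""
    return {
        # Assumption (not from the thread): strip spaces from the query too,
        # so multi-word input can still prefix-match search_string.
        "q": "".join(query.split()),
        "query_by": "search_string",
        "highlight_full_fields": "display_words",
    }


doc = to_indexable("Anterior part of the inferior surface of cerebrum")
params = build_search_params("Anterior pa")
print(doc)
print(params)
```

The document dict would be sent to the collection at indexing time, and the params dict with each search request.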
Jason
03:08 PM
Johannes
03:13 PM
Jason
03:13 PM
Jason
03:14 PM
Johannes
03:17 PM
Jason
03:20 PM
Johannes
03:25 PM
Jason
03:26 PM Yup yup. That's the general use-case.
Jason
04:27 PM For example, if there were two records with these titles:
"Function of the brain"
"Brain function"
and the search query is "Brai", this feature would rank the results as:
1. "Brain function"
2. "Function of the brain"
Since "Brain" appears earlier in the field in result #1.
The key thing is that word position is a ranking signal and doesn't exclude any results. But in your use case, it sounds like you'd want to not show #2 at all, since its first word doesn't start with "Brai", right?
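The difference between position as a ranking signal and position as a filter can be sketched in Python. This is only an illustration of the idea in the message above, not Typesense's actual scoring code; all function names are hypothetical.

```python
from typing import List, Optional


def first_prefix_position(title: str, query: str) -> Optional[int]:
    """Index of the first word starting with the query, or None if no word does."""
    q = query.lower()
    for i, word in enumerate(title.lower().split()):
        if word.startswith(q):
            return i
    return None


def rank_by_position(titles: List[str], query: str) -> List[str]:
    """Ranking signal: earlier matches sort first, nothing is dropped."""
    def key(title: str):
        pos = first_prefix_position(title, query)
        # Non-matching titles sort last; sorted() is stable for ties.
        return (pos is None, pos or 0)
    return sorted(titles, key=key)


def first_word_only(titles: List[str], query: str) -> List[str]:
    """Filter: keep only titles whose first word starts with the query."""
    return [t for t in titles if first_prefix_position(t, query) == 0]


titles = ["Function of the brain", "Brain function"]
ranked = rank_by_position(titles, "Brai")
strict = first_word_only(titles, "Brai")
print(ranked)
print(strict)
```

With the two titles from the example, the ranking variant keeps both results and the filter variant keeps only "Brain function".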
Jun 01, 2022 (19 months ago)
Johannes
07:41 AM
a b a b
a b b a
a b b b
b a b a
b a b b
b b a b
b b b a
Johannes
02:47 PM
a b a
a b
b b a b a b a
Basically, the distance of the first matching word from the start is what matters; only when that is the same should the number of the other matches, or even their distances, be taken into account.
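One way to express this ordering is as a sort key, sketched below in Python. It assumes the two lists above are desired result orders for the query "a" (the surrounding message text did not survive in this log), and it is not how Typesense computes its score; `position_key` is a hypothetical name.

```python
def position_key(text: str, token: str = "a"):
    """Sort key: first match position, then more matches, then earlier later matches."""
    words = text.split()
    positions = [i for i, w in enumerate(words) if w == token]
    if not positions:
        # No match at all: sort after every text that contains the token.
        return (float("inf"), 0, ())
    return (positions[0], -len(positions), tuple(positions[1:]))


first_list = ["a b a b", "a b b a", "a b b b", "b a b a", "b a b b", "b b a b", "b b b a"]
second_list = ["a b a", "a b", "b b a b a b a"]

# Sorting the reversed lists recovers the orders given in the messages above.
sorted_first = sorted(reversed(first_list), key=position_key)
sorted_second = sorted(reversed(second_list), key=position_key)
print(sorted_first)
print(sorted_second)
```

In the second list, "b b a b a b a" contains the most matches but still sorts last, because its first match is the furthest from the start.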
Jason
05:37 PM
Jun 07, 2022 (19 months ago)
Kishore Nallan
09:41 AM
Johannes
01:26 PM
Jun 08, 2022 (19 months ago)
Kishore Nallan
01:02 PM typesense/typesense:0.24.0.rc2 Docker build. You need to send a prioritize_token_position=true flag to the search query to enable this feature.
Johannes
01:02 PM
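As a rough illustration, the prioritize_token_position flag mentioned above could be added to a search request's query string like this. The host, the collection name "terms", and the field name "title" are placeholders, not values from the thread; no request is actually sent.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, collection: str, query: str, query_by: str) -> str:
    """Build a Typesense document search URL with token-position priority enabled."""
    params = {
        "q": query,
        "query_by": query_by,
        # The flag from the message above; it has to be sent with each search.
        "prioritize_token_position": "true",
    }
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


# Placeholder host, collection, and field names, for illustration only.
url = build_search_url("http://localhost:8108", "terms", "a", "title")
print(url)
```

A real request would also need the API key, which Typesense expects in the X-TYPESENSE-API-KEY header.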
Jun 09, 2022 (19 months ago)
Johannes
08:39 AM
Johannes
08:41 AM
Kishore Nallan
09:17 AM
Johannes
09:18 AM
Kishore Nallan
09:39 AM