Inconsistent Search Results in Typesense UI vs Dashboard
TLDR Abhishek reports inconsistent search results in the Typesense UI versus the dashboard integration when using page rank with the Docusaurus plugin. Jason suggests creating a GitHub issue, while Abhishek seeks clarification on prioritizing exact matches.
Feb 22, 2023 (7 months ago)
Abhishek
11:13 AM
Abhishek
11:16 AM
{
  "url": "https://host.com/glossary/",
  "page_rank": 5
},
{
  "url": "https://host.com",
  "page_rank": 1
}
so all documents parsed from the glossary page will have a higher rank than the rest of the pages on the site.
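A minimal sketch of the intent described above, assuming the most specific matching URL prefix decides a page's rank. The `page_rank_for` helper and the longest-prefix rule are illustrative assumptions, not the scraper's actual matching logic; the two entries are the ones from the config above.

```python
# Entries copied from the config fragment above.
START_URLS = [
    {"url": "https://host.com/glossary/", "page_rank": 5},
    {"url": "https://host.com", "page_rank": 1},
]


def page_rank_for(page_url: str, start_urls=START_URLS, default: int = 0) -> int:
    """Return the page_rank of the longest start URL that prefixes page_url.

    Hypothetical helper: longest-prefix matching is an assumption made for
    illustration, not confirmed scraper behavior.
    """
    matches = [entry for entry in start_urls if page_url.startswith(entry["url"])]
    if not matches:
        return default
    # The most specific (longest) matching prefix wins.
    return max(matches, key=lambda entry: len(entry["url"]))["page_rank"]
```

Under this assumption, any page below `https://host.com/glossary/` gets rank 5 and every other page on the host gets rank 1, which is the behavior the message describes.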
Abhishek
11:19 AM
page_rank?
Jason
02:39 PM
Feb 23, 2023 (7 months ago)
Abhishek
03:36 PM
Jason
05:30 PM
This happens by default in Typesense. prioritize_exact_match is the parameter name, and it’s set to true by default.
Abhishek
07:58 PM
Jason
08:24 PM
Feb 24, 2023 (7 months ago)
Abhishek
07:51 PM
form token, and for "form token". The first few results are:
{
  "results": [
    {
      "facet_counts": [],
      "found": 26,
      "grouped_hits": [
        {
          "group_key": [
            " "
          ],
          "hits": [
            {
              "document": {
                "content": "When you upload a form to a document set template , party templates and field templates are automatically created based on the names of the form tokens . While you can use any names you want for your tokens, following certain naming conventions will make it possible for party templates and field templates to be automatically set up, saving you time",
                "hierarchy": {
                  "lvl0": "Form Token Naming Conventions",
                  "lvl1": "Form Token Naming Conventions",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                },
                "hierarchy.lvl0": "Form Token Naming Conventions",
                "hierarchy.lvl1": "Form Token Naming Conventions",
                "id": "570",
                "type": "content",
                "url": " "
              },
              "highlight": {
                "content": {
                  "matched_tokens": [
                    "form",
                    "token"
                  ],
                  "snippet": "the names of the <mark>form</mark> <mark>token</mark>s . While you can use",
                  "value": "When you upload a <mark>form</mark> to a document set template , party templates and field templates are automatically created based on the names of the <mark>form</mark> <mark>token</mark>s . While you can use any names you want for your <mark>token</mark>s, following certain naming conventions will make it possible for party templates and field templates to be automatically set up, saving you time"
                }
              },
              "highlights": [
                {
                  "field": "content",
                  "matched_tokens": [
                    "form",
                    "token"
                  ],
                  "snippet": "the names of the <mark>form</mark> <mark>token</mark>s . While you can use",
                  "value": "When you upload a <mark>form</mark> to a document set template , party templates and field templates are automatically created based on the names of the <mark>form</mark> <mark>token</mark>s . While you can use any names you want for your <mark>token</mark>s, following certain naming conventions will make it possible for party templates and field templates to be automatically set up, saving you time"
                }
              ],
              "text_match": 1157451471441100923,
              "text_match_info": {
                "best_field_score": "2211897868288",
                "best_field_weight": 15,
                "fields_matched": 3,
                "score": "1157451471441100923",
                "tokens_matched": 2
              }
            },
            {
              "document": {
                "hierarchy": {
                  "lvl0": "Form Token Naming Conventions",
                  "lvl1": "Form Token Naming Conventions",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                },
                "hierarchy.lvl0": "Form Token Naming Conventions",
                "hierarchy.lvl1": "Form Token Naming Conventions",
                "id": "569",
                "type": "lvl1",
                "url": " "
              },
              "highlight": {
                "hierarchy": {
                  "lvl0": "Form Token Naming Conventions",
                  "lvl1": "Form Token Naming Conventions",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                }
              },
              "highlights": [],
              "text_match": 1157451471441100922,
              "text_match_info": {
                "best_field_score": "2211897868288",
                "best_field_weight": 15,
                "fields_matched": 2,
                "score": "1157451471441100922",
                "tokens_matched": 2
              }
            },
            ...
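To compare hits without reading the full payload, the ranking fields can be pulled out per hit. This is a minimal sketch, assuming the response has the grouped shape shown above; `summarize_hits` is a hypothetical helper name, and `SAMPLE` is trimmed to the ranking fields of the two hits above.

```python
def summarize_hits(multi_search_response: dict) -> list:
    """Flatten grouped hits into (id, type, score, fields_matched, tokens_matched)."""
    rows = []
    for result in multi_search_response.get("results", []):
        for group in result.get("grouped_hits", []):
            for hit in group.get("hits", []):
                doc = hit.get("document", {})
                info = hit.get("text_match_info", {})
                rows.append((
                    doc.get("id"),
                    doc.get("type"),
                    int(info.get("score", 0)),  # score is sent as a string
                    info.get("fields_matched"),
                    info.get("tokens_matched"),
                ))
    return rows


# Trimmed copy of the two hits shown above (only the fields used here).
SAMPLE = {"results": [{"grouped_hits": [{"hits": [
    {"document": {"id": "570", "type": "content"},
     "text_match_info": {"score": "1157451471441100923",
                         "fields_matched": 3, "tokens_matched": 2}},
    {"document": {"id": "569", "type": "lvl1"},
     "text_match_info": {"score": "1157451471441100922",
                         "fields_matched": 2, "tokens_matched": 2}},
]}]}]}

for row in summarize_hits(SAMPLE):
    print(row)
```

For the two hits shown, the scores differ only in the final digit, and the first hit reports `fields_matched` 3 against 2 for the second.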
Abhishek
07:51 PM
Abhishek
07:52 PM
form token - 1st screenshot
Result with "form token" - 2nd screenshot
Abhishek
07:54 PM
Abhishek
07:55 PM
Form token as the query. I mainly want Form Token from the Glossary page to be shown as the first result.
Feb 27, 2023 (7 months ago)
Abhishek
05:43 AM
Jason
06:56 PM
Hmm, I’m surprised to hear this. To make this easier to debug, could you open the network tab in the browser dev console, then do each search query via the UI, and then look for a request to Typesense (to the multi_search endpoint), right click, copy-as-curl and send me that curl command for both?
Abhishek
10:44 PM
curl '' \
-H 'authority: ' \
-H 'accept: application/json, text/plain, */*' \
-H 'accept-language: en-US,en;q=0.9,fr;q=0.8,da;q=0.7' \
-H 'content-type: text/plain' \
-H 'origin: ' \
-H 'referer: ' \
-H 'sec-ch-ua: "Not?A_Brand";v="8", "Chromium";v="108", "Google Chrome";v="108"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Linux"' \
-H 'sec-fetch-dest: empty' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-site: cross-site' \
-H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
--data-raw '{"searches":[{"collection":"clerky-typesense-search","q":"form toke","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"filter_by":"language:=en && docusaurus_tag:=[default,docs-default-current]"}]}' \
--compressed ;
curl '' \
-H 'authority: ' \
-H 'accept: application/json, text/plain, */*' \
-H 'accept-language: en-US,en;q=0.9,fr;q=0.8,da;q=0.7' \
-H 'content-type: text/plain' \
-H 'origin: ' \
-H 'referer: ' \
-H 'sec-ch-ua: "Not?A_Brand";v="8", "Chromium";v="108", "Google Chrome";v="108"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Linux"' \
-H 'sec-fetch-dest: empty' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-site: cross-site' \
-H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
--data-raw '{"searches":[{"collection":"clerky-typesense-search","q":"form token","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"filter_by":"language:=en && docusaurus_tag:=[default,docs-default-current]"}]}' \
--compressed
Abhishek
10:45 PM
curl '' \
-H 'authority: ' \
-H 'accept: application/json, text/plain, */*' \
-H 'accept-language: en-US,en;q=0.9,fr;q=0.8,da;q=0.7' \
-H 'content-type: text/plain' \
-H 'origin: ' \
-H 'referer: ' \
-H 'sec-ch-ua: "Not?A_Brand";v="8", "Chromium";v="108", "Google Chrome";v="108"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Linux"' \
-H 'sec-fetch-dest: empty' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-site: cross-site' \
-H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
--data-raw '{"searches":[{"collection":"clerky-typesense-search","q":"\"form token\"","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"filter_by":"language:=en && docusaurus_tag:=[default,docs-default-current]"}]}' \
--compressed
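When comparing copied requests like the ones above, it can help to check mechanically which search parameters actually differ. This is a minimal sketch, assuming each command carries its JSON body in `--data-raw`; `extract_payload` and `diff_first_search` are hypothetical helper names, and `PLAIN` and `QUOTED` are trimmed stand-ins for the captured commands.

```python
import json
import shlex


def extract_payload(curl_command: str) -> dict:
    """Return the JSON request body from a copied curl command."""
    # Join backslash-continued lines so shlex sees one logical command.
    tokens = shlex.split(curl_command.replace("\\\n", " "))
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("--data-raw", "--data", "-d"):
            return json.loads(value)
    raise ValueError("no request body found in curl command")


def diff_first_search(a: dict, b: dict) -> dict:
    """Map each differing parameter of the first search to its (a, b) values."""
    first, second = a["searches"][0], b["searches"][0]
    return {
        key: (first.get(key), second.get(key))
        for key in sorted(set(first) | set(second))
        if first.get(key) != second.get(key)
    }


# Trimmed stand-ins for two of the captured commands (most headers and
# parameters omitted for brevity).
PLAIN = r"""curl '' \
  --data-raw '{"searches":[{"q":"form token","group_by":"url"}]}' \
  --compressed"""
QUOTED = r"""curl '' \
  --data-raw '{"searches":[{"q":"\"form token\"","group_by":"url"}]}' \
  --compressed"""

print(diff_first_search(extract_payload(PLAIN), extract_payload(QUOTED)))
```

If only `q` shows up in the diff, any difference in ranking comes from the query text itself rather than from other search parameters.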
Abhishek
10:45 PM
Mar 01, 2023 (7 months ago)
Jason
09:36 PM
hierarchy.lvl0 has the words “Form Token Naming Conventions”, so that’s why it gets ranked higher, since hierarchy.lvl0 has the highest priority.
Jason
09:45 PM
query_by value in the docusaurus theme config
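A minimal sketch of the two search parameters mentioned in this thread: the order of fields in `query_by`, which Jason says determines priority, and `prioritize_exact_match`, which he says is true by default. `build_search` is a hypothetical helper, the collection name and grouping values are copied from the captured requests, and putting `content` first is only an example ordering, not something recommended in the thread.

```python
# Field order as sent by the UI in the captured requests.
DEFAULT_QUERY_BY = [
    "hierarchy.lvl0", "hierarchy.lvl1", "hierarchy.lvl2", "hierarchy.lvl3",
    "hierarchy.lvl4", "hierarchy.lvl5", "hierarchy.lvl6", "content",
]


def build_search(q: str, query_by=DEFAULT_QUERY_BY,
                 prioritize_exact_match: bool = True) -> dict:
    """Build one entry for the "searches" list of a multi_search request body."""
    return {
        "collection": "clerky-typesense-search",
        "q": q,
        # Earlier fields take priority over later ones.
        "query_by": ",".join(query_by),
        # Already the default; spelled out here only to make it visible.
        "prioritize_exact_match": prioritize_exact_match,
        "group_by": "url",
        "group_limit": 3,
    }


# Example ordering (an assumption for illustration): move "content" to the front.
content_first = ["content"] + [f for f in DEFAULT_QUERY_BY if f != "content"]
body = {"searches": [build_search("form token", query_by=content_first)]}
print(body["searches"][0]["query_by"])
```

Whether a different field order gives the desired ranking depends on the indexed data, so any change is best checked against the real collection.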