#community-help

Inconsistent Search Results in Typesense UI vs Dashboard

TLDR Abhishek reports that results in the Docusaurus search plugin UI are ordered differently from those in the Typesense dashboard when using page_rank. Jason suggests creating a GitHub issue; on a follow-up about prioritizing exact matches, he explains that field order in query_by determines the ranking.

Feb 22, 2023 (7 months ago)
Abhishek
11:13 AM
hello, I am using typesense with the docusaurus integration and seeing different results in UI vs typesense search console with page rank. 🧵
Abhishek
11:16 AM
My config looks like -
    {
      "url": "https://host.com/glossary/",
      "page_rank": 5
    },
    {
      "url": "https://host.com",
      "page_rank": 1
    }

So all documents parsed from the glossary page will have a higher rank than the rest of the pages on the site.
Abhishek
11:19 AM
When searching for a document via the typesense dashboard, I see the documents in the order of the page rank. But when searching the same queries via the docusaurus search plugin, I see documents in a different order. Does the docusaurus search plugin not consider the page_rank?
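One way to picture the difference described here: if one search path breaks relevance ties with a rank field and the other does not, the same hits can come back in a different order. A minimal sketch, assuming made-up hits with a `page_rank` value attached (the field name, URL paths, and scores are hypothetical and not taken from the scraper's schema):

```python
# Hypothetical hits with equal text relevance but different page_rank.
hits = [
    {"url": "https://host.com/docs/forms", "text_match": 100, "page_rank": 1},
    {"url": "https://host.com/glossary/form-token", "text_match": 100, "page_rank": 5},
]


def order_with_rank(hits):
    # Relevance first, then page_rank as a tie-breaker (both descending).
    return sorted(hits, key=lambda h: (-h["text_match"], -h["page_rank"]))


def order_without_rank(hits):
    # Relevance only; Python's sort is stable, so ties keep their input order.
    return sorted(hits, key=lambda h: -h["text_match"])


with_rank_urls = [h["url"] for h in order_with_rank(hits)]
without_rank_urls = [h["url"] for h in order_without_rank(hits)]
print(with_rank_urls)
print(without_rank_urls)
```

With the tie-breaker the glossary page comes first; without it, the order depends only on relevance and input order.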
Jason
02:39 PM
Looks like we don’t use that in the Typesense docsearch version. But should be an easy update. Could you copy-paste this into a GitHub issue in this repo: https://github.com/typesense/typesense-docsearch.js
Feb 23, 2023 (7 months ago)
Abhishek
03:36 PM
Thanks! A follow-up question - how do I do exact phrase matching via the docsearch integration? I understand that I can do it with quotes, but is there a way to make it the default and/or prioritise exact matches over token matches?
Jason
05:30 PM
> but is there a way to make it the default and/or prioritise exact matches over token matches
This happens by default in Typesense. prioritize_exact_match is the parameter name, and it’s set to true by default
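For reference, the parameter can also be spelled out in a multi_search request body even though it defaults to true. A small sketch of such a body; the collection name `docs` and the shortened `query_by` are placeholders, not values from this thread:

```python
import json


def build_search(collection, q, query_by, prioritize_exact_match=True):
    # Build a multi_search body with prioritize_exact_match stated explicitly.
    return {
        "searches": [
            {
                "collection": collection,
                "q": q,
                "query_by": query_by,
                "prioritize_exact_match": prioritize_exact_match,
            }
        ]
    }


# "docs" and the two-field query_by are illustrative placeholders.
body = build_search("docs", "form token", "hierarchy.lvl0,content")
print(json.dumps(body, indent=2))
```

Sending the parameter explicitly should not change results when the default is already true, but it makes the intent visible when comparing requests.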
Abhishek
07:58 PM
Unfortunately, I don't see the exact matches being prioritised unless I add quotes around the query.
Jason
08:24 PM
Could you give me an example of a search query (with all the search params -> you’ll see this in the network request), the JSON of the first result that is returned and the JSON of the result you expect to be shown first?
Feb 24, 2023 (7 months ago)
Abhishek
07:51 PM
Hm, I notice that the response is exactly the same for a search term like form token and for "form token". The first few results are -
{
  "results": [
    {
      "facet_counts": [],
      "found": 26,
      "grouped_hits": [
        {
          "group_key": [
            ""
          ],
          "hits": [
            {
              "document": {
                "content": "When you upload a  form  to a  document set template ,  party templates  and  field templates  are automatically created based on the names of the  form tokens . While you can use any names you want for your tokens, following certain naming conventions will make it possible for  party templates  and  field templates  to be automatically set up, saving you time",
                "hierarchy": {
                  "lvl0": "Form Token Naming Conventions",
                  "lvl1": "Form Token Naming Conventions",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                },
                "hierarchy.lvl0": "Form Token Naming Conventions",
                "hierarchy.lvl1": "Form Token Naming Conventions",
                "id": "570",
                "type": "content",
                "url": ""
              },
              "highlight": {
                "content": {
                  "matched_tokens": [
                    "form",
                    "token"
                  ],
                  "snippet": "the names of the  <mark>form</mark> <mark>token</mark>s . While you can use",
                  "value": "When you upload a  <mark>form</mark>  to a  document set template ,  party templates  and  field templates  are automatically created based on the names of the  <mark>form</mark> <mark>token</mark>s . While you can use any names you want for your <mark>token</mark>s, following certain naming conventions will make it possible for  party templates  and  field templates  to be automatically set up, saving you time"
                }
              },
              "highlights": [
                {
                  "field": "content",
                  "matched_tokens": [
                    "form",
                    "token"
                  ],
                  "snippet": "the names of the  <mark>form</mark> <mark>token</mark>s . While you can use",
                  "value": "When you upload a  <mark>form</mark>  to a  document set template ,  party templates  and  field templates  are automatically created based on the names of the  <mark>form</mark> <mark>token</mark>s . While you can use any names you want for your <mark>token</mark>s, following certain naming conventions will make it possible for  party templates  and  field templates  to be automatically set up, saving you time"
                }
              ],
              "text_match": 1157451471441100923,
              "text_match_info": {
                "best_field_score": "2211897868288",
                "best_field_weight": 15,
                "fields_matched": 3,
                "score": "1157451471441100923",
                "tokens_matched": 2
              }
            },
            {
              "document": {
                "hierarchy": {
                  "lvl0": "Form Token Naming Conventions",
                  "lvl1": "Form Token Naming Conventions",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                },
                "hierarchy.lvl0": "Form Token Naming Conventions",
                "hierarchy.lvl1": "Form Token Naming Conventions",
                "id": "569",
                "type": "lvl1",
                "url": ""
              },
              "highlight": {
                "hierarchy": {
                  "lvl0": "Form Token Naming Conventions",
                  "lvl1": "Form Token Naming Conventions",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                }
              },
              "highlights": [],
              "text_match": 1157451471441100922,
              "text_match_info": {
                "best_field_score": "2211897868288",
                "best_field_weight": 15,
                "fields_matched": 2,
                "score": "1157451471441100922",
                "tokens_matched": 2
              }
            },
...

Abhishek
07:51 PM
but what I see in the UI is not the same.
Abhishek
07:52 PM
Result with `form token` - 1st screenshot
Result with `"form token"` - 2nd screenshot
Abhishek
07:55 PM
My point is that if exact matches were prioritised, I would see results from the second screenshot before the first when using Form token as the query. I mainly want Form Token from the Glossary page to be shown as the first result.
Feb 27, 2023 (7 months ago)
Abhishek
05:43 AM
Jason bump
Jason
06:56 PM
> hm I notice that the response of the result is exactly the same for a search term like form token, and for “form token”.
Hmm, I’m surprised to hear this. To make this easier to debug, could you open the network tab in the browser dev console, then do each search query via the UI, and then look for a request to Typesense (to the multi_search endpoint), right click, copy-as-curl and send me that curl command for both?
Abhishek
10:44 PM
curl '' \
  -H 'authority: ' \
  -H 'accept: application/json, text/plain, */*' \
  -H 'accept-language: en-US,en;q=0.9,fr;q=0.8,da;q=0.7' \
  -H 'content-type: text/plain' \
  -H 'origin: ' \
  -H 'referer: ' \
  -H 'sec-ch-ua: "Not?A_Brand";v="8", "Chromium";v="108", "Google Chrome";v="108"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: cross-site' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
  --data-raw '{"searches":[{"collection":"clerky-typesense-search","q":"form toke","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"filter_by":"language:=en && docusaurus_tag:=[default,docs-default-current]"}]}' \
  --compressed ;
curl '' \
  -H 'authority: ' \
  -H 'accept: application/json, text/plain, */*' \
  -H 'accept-language: en-US,en;q=0.9,fr;q=0.8,da;q=0.7' \
  -H 'content-type: text/plain' \
  -H 'origin: ' \
  -H 'referer: ' \
  -H 'sec-ch-ua: "Not?A_Brand";v="8", "Chromium";v="108", "Google Chrome";v="108"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: cross-site' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
  --data-raw '{"searches":[{"collection":"clerky-typesense-search","q":"form token","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"filter_by":"language:=en && docusaurus_tag:=[default,docs-default-current]"}]}' \
  --compressed
Abhishek
10:45 PM
curl '' \
  -H 'authority: ' \
  -H 'accept: application/json, text/plain, */*' \
  -H 'accept-language: en-US,en;q=0.9,fr;q=0.8,da;q=0.7' \
  -H 'content-type: text/plain' \
  -H 'origin: ' \
  -H 'referer: ' \
  -H 'sec-ch-ua: "Not?A_Brand";v="8", "Chromium";v="108", "Google Chrome";v="108"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: cross-site' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36' \
  --data-raw '{"searches":[{"collection":"clerky-typesense-search","q":"\"form token\"","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"filter_by":"language:=en && docusaurus_tag:=[default,docs-default-current]"}]}' \
  --compressed
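To check what actually differs between two copied requests, the `--data-raw` payloads can be parsed and compared key by key. A minimal sketch; `diff_params` is a hypothetical helper, and the payloads are trimmed copies of the ones above (the long field lists are omitted for space):

```python
import json


def diff_params(a, b):
    """Return {key: (a_value, b_value)} for every parameter that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Trimmed copies of the --data-raw payloads (query_by etc. left out).
unquoted = json.loads(
    r'{"searches":[{"collection":"clerky-typesense-search","q":"form token",'
    r'"group_by":"url","group_limit":3}]}'
)
quoted = json.loads(
    r'{"searches":[{"collection":"clerky-typesense-search","q":"\"form token\"",'
    r'"group_by":"url","group_limit":3}]}'
)

diff = diff_params(unquoted["searches"][0], quoted["searches"][0])
print(diff)
```

For these trimmed payloads only `q` differs, so any difference in ordering would come from how the query string itself is interpreted.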
Mar 01, 2023 (7 months ago)
Jason
09:36 PM
In your first screenshot, the field hierarchy.lvl0 contains the phrase “Form Token Naming Conventions”, so that’s why it gets ranked higher, since hierarchy.lvl0 has the highest priority.
Jason
09:45 PM
If you want to change this field priority, you can set a custom query_by value in the docusaurus theme config
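Since earlier fields in query_by get higher priority, a custom value can be built by moving the preferred field to the front of the default list. A rough sketch; `prioritize_field` is a hypothetical helper, and moving `content` first is only an illustration, not a recommendation from this thread. Where the resulting string goes is also an assumption: I believe the theme accepts it under `typesenseSearchParameters` in `themeConfig.typesense`, but verify that against the plugin README.

```python
def prioritize_field(query_by, field):
    # Move `field` to the front of a comma-separated query_by list.
    fields = [f.strip() for f in query_by.split(",") if f.strip()]
    if field not in fields:
        raise ValueError(f"{field!r} is not in query_by")
    return ",".join([field] + [f for f in fields if f != field])


# Default field order as seen in the copied requests in this thread.
default_query_by = (
    "hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,"
    "hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content"
)

# Illustrative choice: give the content field the highest priority.
custom_query_by = prioritize_field(default_query_by, "content")
print(custom_query_by)
```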