can anyone help me out why the hits structure like...
# community-help
r
Can anyone help me understand why the hits are structured like that? I think "add sdk implementation" has to be on top.
j
Could you open the browser’s network inspector, do a search, then look for a request to multi_search, then copy-as-curl the request and paste it here? Could you also copy the response for the API call and paste it here?
r
curl 'http://localhost:8108/multi_search?x-typesense-api-key=xyz' \
  -H 'Accept: application/json, text/plain, */*' \
  -H 'Accept-Language: en-GB,en-US;q=0.9,en;q=0.8' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: text/plain' \
  -H 'Origin: https://3238-180-151-109-60.in.ngrok.io' \
  -H 'Pragma: no-cache' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: cross-site' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36' \
  -H 'sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  --data-raw '{"searches":[{"collection":"Developer_Docs","q":"add sdk impl","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"filter_by":"product_tag:=payment-page_android"}]}' \
  --compressed
{
  "results": [
    {
      "facet_counts": [],
      "found": 1,
      "grouped_hits": [
        {
          "group_key": [
            "https://3238-180-151-109-60.in.ngrok.io/payment-page/android/base-sdk-integration/getting-sdk#add-sdk-implementation"
          ],
          "hits": [
            {
              "document": {
                "anchor": "add-sdk-implementation",
                "content": "After adding Assets Plugin and Hyper SDK to your project, please ensure you do a gradle sync and clean build",
                "hierarchy": {
                  "lvl0": "1. Getting the SDK",
                  "lvl1": "Add SDK implementation",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                },
                "hierarchy.lvl0": "1. Getting the SDK",
                "hierarchy.lvl1": "Add SDK implementation",
                "id": "158",
                "type": "content",
                "url": "https://3238-180-151-109-60.in.ngrok.io/payment-page/android/base-sdk-integration/getting-sdk#add-sdk-implementation"
              },
              "highlight": {
                "content": {
                  "matched_tokens": [
                    "SDK"
                  ],
                  "snippet": "After adding Assets Plugin and Hyper <mark>SDK</mark> to your project, please ensure you do a gradle sync and clean build",
                  "value": "After adding Assets Plugin and Hyper <mark>SDK</mark> to your project, please ensure you do a gradle sync and clean build"
                }
              },
              "highlights": [
                {
                  "field": "content",
                  "matched_tokens": [
                    "SDK"
                  ],
                  "snippet": "After adding Assets Plugin and Hyper <mark>SDK</mark> to your project, please ensure you do a gradle sync and clean build",
                  "value": "After adding Assets Plugin and Hyper <mark>SDK</mark> to your project, please ensure you do a gradle sync and clean build"
                }
              ],
              "text_match": 1736172785157800000,
              "text_match_info": {
                "best_field_score": "3315687620864",
                "best_field_weight": 14,
                "fields_matched": 3,
                "score": "1736172785157800051",
                "tokens_matched": 3
              }
            },
            {
              "document": {
                "anchor": "add-sdk-implementation",
                "content": "To inject SDK as dependency in your application, include it in your  application build.gradle  dependencies",
                "hierarchy": {
                  "lvl0": "1. Getting the SDK",
                  "lvl1": "Add SDK implementation",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                },
                "hierarchy.lvl0": "1. Getting the SDK",
                "hierarchy.lvl1": "Add SDK implementation",
                "id": "157",
                "type": "content",
                "url": "https://3238-180-151-109-60.in.ngrok.io/payment-page/android/base-sdk-integration/getting-sdk#add-sdk-implementation"
              },
              "highlight": {
                "content": {
                  "matched_tokens": [
                    "SDK"
                  ],
                  "snippet": "To inject <mark>SDK</mark> as dependency in your application, include it in your  application build.gradle  dependencies",
                  "value": "To inject <mark>SDK</mark> as dependency in your application, include it in your  application build.gradle  dependencies"
                }
              },
              "highlights": [
                {
                  "field": "content",
                  "matched_tokens": [
                    "SDK"
                  ],
                  "snippet": "To inject <mark>SDK</mark> as dependency in your application, include it in your  application build.gradle  dependencies",
                  "value": "To inject <mark>SDK</mark> as dependency in your application, include it in your  application build.gradle  dependencies"
                }
              ],
              "text_match": 1736172785157800000,
              "text_match_info": {
                "best_field_score": "3315687620864",
                "best_field_weight": 14,
                "fields_matched": 3,
                "score": "1736172785157800051",
                "tokens_matched": 3
              }
            },
            {
              "document": {
                "anchor": "add-sdk-implementation",
                "hierarchy": {
                  "lvl0": "1. Getting the SDK",
                  "lvl1": "Add SDK implementation",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                },
                "hierarchy.lvl0": "1. Getting the SDK",
                "hierarchy.lvl1": "Add SDK implementation",
                "id": "156",
                "type": "lvl1",
                "url": "https://3238-180-151-109-60.in.ngrok.io/payment-page/android/base-sdk-integration/getting-sdk#add-sdk-implementation"
              },
              "highlight": {
                "hierarchy": {
                  "lvl0": "1. Getting the SDK",
                  "lvl1": "Add SDK implementation",
                  "lvl2": null,
                  "lvl3": null,
                  "lvl4": null,
                  "lvl5": null,
                  "lvl6": null
                }
              },
              "highlights": [],
              "text_match": 1736172785157800000,
              "text_match_info": {
                "best_field_score": "3315687620864",
                "best_field_weight": 14,
                "fields_matched": 2,
                "score": "1736172785157800050",
                "tokens_matched": 3
              }
            }
          ]
        }
      ],
      "out_of": 346,
      "page": 1,
      "request_params": {
        "collection_name": "Developer_Docs_1679258843",
        "per_page": 10,
        "q": "add sdk impl"
      },
      "search_cutoff": false,
      "search_time_ms": 1
    }
  ]
}
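A rough sketch for reading the response above: it flattens `grouped_hits` into one row per hit so the ordering is easier to inspect. `summarizeGroupedHits` is a hypothetical helper, not part of Typesense or docsearch.js, and the sample data is a trimmed copy of the response. It reads the score from the `text_match_info.score` string as a BigInt, because the numeric `text_match` is larger than a JavaScript number can hold exactly and appears rounded in the paste.

```javascript
// Flatten a grouped multi_search response into one row per hit.
// Hypothetical helper for debugging; not a Typesense or docsearch.js API.
function summarizeGroupedHits(response) {
  const rows = [];
  for (const result of response.results ?? []) {
    for (const group of result.grouped_hits ?? []) {
      for (const hit of group.hits ?? []) {
        rows.push({
          id: hit.document.id,
          type: hit.document.type,
          // Use the string score: the numeric text_match exceeds
          // Number.MAX_SAFE_INTEGER and loses its last digits.
          score: BigInt(hit.text_match_info.score),
        });
      }
    }
  }
  return rows;
}

// Trimmed copy of the response above (ids, types, and scores only).
const sample = {
  results: [{
    grouped_hits: [{
      hits: [
        { document: { id: "158", type: "content" }, text_match_info: { score: "1736172785157800051" } },
        { document: { id: "157", type: "content" }, text_match_info: { score: "1736172785157800051" } },
        { document: { id: "156", type: "lvl1" }, text_match_info: { score: "1736172785157800050" } },
      ],
    }],
  }],
};

const rows = summarizeGroupedHits(sample);
for (const row of rows) {
  console.log(row.id, row.type, row.score.toString());
}
```

Read this way, the `lvl1` heading hit (id 156) scores one point below the two `content` hits, which is consistent with it being listed last in the group.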
j
In your docsearch.js initialization code, could you add:
typesenseSearchParameters: {
  filter_by: '...',
  sort_by: 'item_priority:desc',
},
And see if that helps?
If it does, then I can update the docsearch.js defaults to set this automatically… Let me know
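For context, a minimal sketch of where `typesenseSearchParameters` sits in a docsearch.js initialization. The container selector and server values are placeholders or assumptions; the collection name, API key placeholder, and `filter_by` value are copied from the curl request earlier in this thread, and only `sort_by` is the suggested change.

```javascript
// Options object for typesense-docsearch.js.
// container and the server node are placeholders (assumptions).
const docsearchOptions = {
  container: "#searchbar", // placeholder selector
  typesenseCollectionName: "Developer_Docs", // from the curl request
  typesenseServerConfig: {
    nodes: [{ host: "localhost", port: 8108, protocol: "http" }],
    apiKey: "xyz", // placeholder search-only key
  },
  typesenseSearchParameters: {
    filter_by: "product_tag:=payment-page_android", // from the curl request
    sort_by: "item_priority:desc", // the suggested addition
  },
};

// In the browser, the script tag exposes a global docsearch function.
// The guard lets this snippet also run outside the browser.
if (typeof docsearch === "function") {
  docsearch(docsearchOptions);
}
```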
r
Great, that worked. Thanks a lot @Jason Bosco 🙏
j
Ok great! I’m going to push out an update to docsearch.js that sets that by default, so you don’t have to do this manually. I’ll keep you posted shortly.
r
Thanks. And anything about the blank space? Here you can see I have some data on this page, but I still get a blank result. It happens when we search for a lvl0.
j
Could you share the response from Typesense for that?
r
sure
Could you please share a GitHub link where I can send the large file?
j
May I know if you’re using docsearch.js or docsearch-react?
r
docsearch.js
j
Could you upgrade to 3.4.0-0 of typesense-docsearch.js and then try again?
Could you also remove the sort_by parameter that you added manually?
r
sure
Could you please help me with how to upgrade docsearch.js?
j
in package.json
Oh wait, are you using the script tag directly?
r
yes
j
You want to change this line:
<!-- Before the closing body -->
<script src="https://cdn.jsdelivr.net/npm/typesense-docsearch.js@3.4.0-0"></script>
In the docs it currently says 3.0.1 for the script tag. Change that to 3.4.0-0.
r
awesome that works
j
The blank result is gone too?
Could you show me a screenshot of how the results look now?
r
here is the result
j
Awesome 🙌
CC: @Abhishek Raj Thank you for that PR! ^ I’ll publish this change for you in the docusaurus theme as well by EOD
r
Thanks a lot, the blank space now shows lvl0.
One more thing: is this the expected behavior? I searched for a key and got the result on top, but I also got example text that is not needed. You can see the document under the backdrop. Also, can we add (...) at the start of a hit when the match is inside a long text, so it's clear that there is some text before it?
j
I searched for a key and got the result on top, but I also got example text that is not needed
The scraper just shows all content that is on the page, as specified by the CSS selectors. If you don’t want examples to show up, you’d want to exclude them via CSS selectors.
can we add (...) at the start of a hit when the match is inside a long text, so it’s clear that there is some text before it
The ellipsis should technically be shown at the end of the hits… looks like that’s hidden in the UI. In your docsearch initialization code, could you try adding this:
typesenseSearchParameters: {
  filter_by: '...',
  highlight_affix_num_tokens: 3,
},
r
nothing changed after adding this
j
Could you share the response from Typesense?
message has been deleted
j
Hmm that doesn’t seem like the same API response for the search query in your screenshot
Could you open the network inspector first, then type in that search query and then send me the api response of the last call to multi_search?
r
sure
j
Ah, could you also set snippet_threshold: 5?
r
in configJS?
j
typesenseSearchParameters: {
  filter_by: '...',
  snippet_threshold: 5,
},
in the docsearch initialization code
r
Nothing changed. I can't see anything like ... on the hits.
j
Hmm, could you share the curl request to Typesense and the response once again?
Did anything change in the UI at all or does it still look the same?
r
yes
I added snippet_threshold: 5, but I'm still getting the same result as before.
curl 'http://localhost:8108/multi_search?x-typesense-api-key=xyz' \
  -H 'Accept: application/json, text/plain, */*' \
  -H 'Accept-Language: en-GB,en-US;q=0.9,en;q=0.8' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: text/plain' \
  -H 'Origin: http://localhost' \
  -H 'Pragma: no-cache' \
  -H 'Referer: http://localhost/' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-site' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36' \
  -H 'sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  --data-raw '{"searches":[{"collection":"Developer_Docs","q":"to be present","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"sort_by":"item_priority:desc","filter_by":"product_tag:=payment-page_android","snippet_threshold":5}]}' \
  --compressed
j
Could you run ngrok for port 8108 temporarily? I’d like to be able to reach that Typesense server
r
I will run it for 8108
j
ok, could you share that url with me?
r
sure give me a moment
j
Let’s try this:
"snippet_threshold": 5,
"highlight_affix_num_tokens": 3
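One way to confirm that both parameters actually reach Typesense is to inspect the `multi_search` request body; in the curl request shared earlier, only `snippet_threshold` appears. The sketch below is an illustration under assumptions: `buildMultiSearchBody` is a hypothetical helper, `query_by` is shortened, and how docsearch.js merges `typesenseSearchParameters` internally is not shown in this thread.

```javascript
// Build the JSON body for POST /multi_search by merging extra search
// parameters into a single search entry. Hypothetical helper.
function buildMultiSearchBody(baseSearch, extraParams) {
  return JSON.stringify({ searches: [{ ...baseSearch, ...extraParams }] });
}

const baseSearch = {
  collection: "Developer_Docs",
  q: "to be present",
  query_by: "hierarchy.lvl0,hierarchy.lvl1,content", // shortened for the example
  group_by: "url",
  group_limit: 3,
};

const body = buildMultiSearchBody(baseSearch, {
  snippet_threshold: 5,
  highlight_affix_num_tokens: 3,
});

// Parse the body back and check that both parameters are present,
// the same check you would do by eye in the network inspector.
const sent = JSON.parse(body).searches[0];
console.log(sent.snippet_threshold, sent.highlight_affix_num_tokens);
```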
r
getting same results
j
Hmm the response is definitely different from the API
Could you share a screenshot?
of the UI
r
message has been deleted
j
That is different from this right:

https://typesense-community.slack.com/files/U04RHF46W5B/F05044HG4M6/screenshot_2023-03-21_at_1.23.31_am.png

r
yes
j
So it’s working now yeah?
Or did I misunderstand the issue
r
No. You can see there is no ... for the text. What I'm trying to say is that the screenshots were taken at different times but show the same result, which is why they look the same.
See the 2nd hit for "to be present". It's a long text, which is why I want to add ... at the start.
Also, the 2nd one matches better than the 1st one, but it's not on top.
j
Ahh got it, you’re talking about the ellipsis specifically…
Looking into it
To debug the order of the results, could you upgrade your Typesense server to 0.24.1.rc10 and let me know?
r
As of now the order of the results is fine. What I mean is that text hits like this should start or end with ... . I got this from the https://docusaurus.io/ site.
j
Could you upgrade to 3.4.0-1 and check now?
r
I'm getting an error on that version.
j
Could you try with 3.4.0-8?
r
Yeah, it's working.
It was working fine a while ago, but now I'm facing this error.
The search bar breaks when opening its popup.
j
Could you share curl request and response for that screenshot?
r
curl 'http://localhost:8108/multi_search?x-typesense-api-key=xyz' \
  -H 'Accept: application/json, text/plain, */*' \
  -H 'Accept-Language: en-GB,en-US;q=0.9,en;q=0.8' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: text/plain' \
  -H 'Origin: http://localhost:3000' \
  -H 'Pragma: no-cache' \
  -H 'Referer: http://localhost:3000/' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-site' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36' \
  -H 'sec-ch-ua: "Chromium";v="110", "Not A(Brand";v="24", "Google Chrome";v="110"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  --data-raw '{"searches":[{"collection":"Developer_Docs","q":"session api","query_by":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","include_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content,anchor,url,type,id","highlight_full_fields":"hierarchy.lvl0,hierarchy.lvl1,hierarchy.lvl2,hierarchy.lvl3,hierarchy.lvl4,hierarchy.lvl5,hierarchy.lvl6,content","group_by":"url","group_limit":3,"sort_by":"item_priority:desc","snippet_threshold":5,"highlight_affix_num_tokens":3,"filter_by":"product_tag:=payment-page_android"}]}' \
  --compressed
https://gist.github.com/rubai99/8f21fb34638fd68f0683137d4e6ee810 Sometimes it works, sometimes it breaks. Right now it's breaking.
j
Could you update your script tag to this:
<script src="https://cdn.jsdelivr.net/npm/typesense-docsearch.js@3.4.0-8/dist/umd/index.js"></script>
And then try replicating the same error in that screenshot and post a stack trace? (This hopefully pulls in the source map and shows a proper stack trace)
r
I get this error when I click on the search bar, and it happens only with this version. When I first added the version it worked fine, then after 5-6 minutes I got an error. An hour later I added this version again and got the same thing: it worked for some time, then suddenly this error appeared. I'm getting the error right now as well.
j
Could you click on “Snippet.js1452” in that stack trace and post a screenshot?
r
j
Looks like the Typesense URL is pointing to localhost, so I can’t see any search results because of that
r
message has been deleted
I think the localhost URL is not the issue, because it works for some time and then I suddenly run into this.
j
Yup yup, that’s a different issue, just with the ngrok URL you shared earlier
I’m pushing a potential fix for the root cause.. let’s see
r
sure
j
Could you try with 3.4.0-9?
r
sure
Now it's working fine. You can check with my ngrok URL.
j
The ngrok URL still doesn’t work for me, because it’s trying to connect to localhost:8108 to talk to Typesense
To get it to work, you would have to start a separate ngrok tunnel for port 8108, then use that ngrok URL in the docsearch init code as the typesense hostname… But that’s too much effort, so that’s fine. Happy to hear that it works now!
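As a sketch of that last step, the tunnel URL for port 8108 can be turned into the node entry that the docsearch init code expects. `nodeFromUrl` is a hypothetical helper and the tunnel URL below is a made-up placeholder; the shape of `typesenseServerConfig` is assumed from the typesense-docsearch.js setup.

```javascript
// Convert a tunnel URL into a node entry for typesenseServerConfig.
// Hypothetical helper; not part of docsearch.js.
function nodeFromUrl(rawUrl) {
  const url = new URL(rawUrl);
  const protocol = url.protocol.replace(":", "");
  // URL.port is an empty string when the protocol's default port is used.
  const port = url.port ? Number(url.port) : protocol === "https" ? 443 : 80;
  return { host: url.hostname, port, protocol };
}

// Placeholder URL standing in for a second ngrok tunnel to port 8108.
const node = nodeFromUrl("https://example-tunnel.ngrok.io");
console.log(node);

const typesenseServerConfig = { nodes: [node], apiKey: "xyz" };
```

With a local URL such as `http://localhost:8108`, the same helper keeps the explicit port instead of falling back to a default.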
r
message has been deleted
Thanks @Jason Bosco. I've asked about a lot of issues so far, sorry for that. Now it's working fine. Great work 👏
j
That’s great to hear! Thank you for helping catch all these issues!
r
Hi @Jason Bosco, is it possible to use multiple collections? We have two products, payment-page and upi-inapp, in our documentation. Suppose we first run the scraper for the collection Developer_Docs_upi-inapp and then run it again for the other collection, Developer_Docs_payment-page. Can we access both collections in a single documentation site? The benefit is that when something changes for one product, we can scrape again for that product's collection only, so we don't need to run the scraper for every product's collection. For reference, you can check our documentation: https://docs.juspay.in/
j
Unfortunately this is not possible to do with the scraper - it creates a whole new collection each time. So you would have to fork the scraper and update it appropriately, if you want to do partial scraping into the same collection
r
@Jason Bosco is there any API to run the scraper in production, and what changes are needed in .env for production?
TYPESENSE_API_KEY=xyz
TYPESENSE_HOST=host.docker.internal
TYPESENSE_PORT=8108
TYPESENSE_PROTOCOL=http
Hey @Jason Bosco, any update on that?
j
You would have to use something like say AWS Fargate (or any docker-based runtime even in your CI pipeline) to run the scraper using the docker image
On Typesense Cloud, we only host the Typesense cluster itself - you still need to run the scraper in your infrastructure
If you’re using Typesense Cloud, the .env file would look something like this:
TYPESENSE_API_KEY=<GENERATED_FROM_DASHBOARD>
TYPESENSE_HOST=xxxx.a1.typesense.net
TYPESENSE_PORT=443
TYPESENSE_PROTOCOL=https
The host and api key will be generated once you provision a cluster
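Before starting the scraper, the `.env` values can be sanity-checked with a small script. This is only a sketch: `parseEnv` and `typesenseBaseUrl` are hypothetical helpers, not part of the scraper, the parser ignores quoting, and the API key below is a placeholder.

```javascript
// Parse KEY=VALUE lines from .env text (no quoting or export handling).
function parseEnv(text) {
  const env = {};
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    const idx = trimmed.indexOf("=");
    if (idx === -1) continue;
    env[trimmed.slice(0, idx)] = trimmed.slice(idx + 1);
  }
  return env;
}

// Check that all four settings exist and build the base URL they describe.
function typesenseBaseUrl(env) {
  const required = [
    "TYPESENSE_API_KEY",
    "TYPESENSE_HOST",
    "TYPESENSE_PORT",
    "TYPESENSE_PROTOCOL",
  ];
  for (const key of required) {
    if (!env[key]) throw new Error(`Missing ${key}`);
  }
  return `${env.TYPESENSE_PROTOCOL}://${env.TYPESENSE_HOST}:${env.TYPESENSE_PORT}`;
}

// Same shape as the Typesense Cloud example; the key is a placeholder.
const cloudEnv = parseEnv(`
TYPESENSE_API_KEY=placeholder-key
TYPESENSE_HOST=xxxx.a1.typesense.net
TYPESENSE_PORT=443
TYPESENSE_PROTOCOL=https
`);
console.log(typesenseBaseUrl(cloudEnv));
```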
r
and anything about chromedriver path ?
j
You can leave that as the default, there’s a chrome executable inside the docker image we publish
r
so we have to build an api to run the scraper for production
j
You can just run the docker command directly
For eg, here’s how we call AWS Fargate that runs the docker command for us for the Typesense docs website: https://github.com/typesense/typesense-website/blob/cca16595a480dc880145bf8b01b8464476ba051e/docs-site/package.json#L12
r
Could you please have a look at why I'm getting this error? It occurs while building dockerfile:base from typesense-docsearch-scraper. Can I change the version to 111.0.5563.110-1? Would changing the version affect anything?
I'm also getting this error while running the scraper. What is the issue here?
j
You need to build the scraper on a Linux machine with intel cpu. Building it on a mac, especially an M1 doesn’t seem to work
Any reason you’re not using the prebuilt Docker image we’ve published?
r
We made some changes, which is why we're not using the prebuilt Docker image. We use a dynamic config.
In that case, do I have to change anything in the code for pod deployment in order to build the scraper?
j
I haven’t been able to get the scraper to build on M1. So you have to spin up a Linux VM and build it from there
r
That covers the first time. After that, if anything changes in the docs, we have to scrape them again. What should we do in that case?
j
You have to push the docker image you build to a docker registry, and then pull the pre-built image from there anytime you want to scrape
r
For Typesense, can we change host='host.docker.internal' to host='localhost'? We don't use Docker to run the scraper as of now. We run it from VS Code via an API and get this error.
j
Yes, you can change that to any hostname, including localhost, in the .env file you’re using
r
sitemap not working for me