#community-help

Docsearch Custom Settings in Typesense Scraper

TLDR Marcos wants to specify token_separators and symbols_to_index without forking the docsearch scraper. Jason suggests opening a GitHub issue to add support for custom settings in the scraper.

Mar 29, 2023
Marcos
10:17 PM
🧵 Docsearch + token separators
Marcos
10:18 PM
How can I specify the token_separators and symbols_to_index using the docsearch scraper?

I can't just update the schema, since it'll be overwritten on the next update.
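
For context, token_separators and symbols_to_index are collection-level settings in a Typesense schema. Below is a minimal sketch of a schema that sets them, built as a plain Python dict. The collection name, the single "content" field, and the chosen characters are placeholders for illustration, not what the scraper actually generates.

```python
import json


def build_schema(name, token_separators, symbols_to_index):
    """Return a Typesense collection schema dict with custom tokenization settings."""
    return {
        "name": name,
        # Placeholder field list; a real docsearch collection has more fields.
        "fields": [
            {"name": "content", "type": "string"},
        ],
        # Characters that split a word into separate tokens, in addition to whitespace.
        "token_separators": list(token_separators),
        # Special characters that should be indexed rather than dropped.
        "symbols_to_index": list(symbols_to_index),
    }


docs_schema = build_schema("docs", token_separators=["-", "_"], symbols_to_index=["+"])
print(json.dumps(docs_schema, indent=2))
```

With the Typesense Python client, a dict like this would typically be passed to client.collections.create(...). The problem described above is that the scraper builds its own schema on every run, so settings applied by hand do not survive.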
Jason
10:22 PM
You would have to fork the scraper and make the change.
Marcos
11:29 PM
I'm using the GitHub Action, so I'd have to fork that too. Can't we just use custom_settings, like Algolia's?
Jason
11:30 PM
Could you elaborate on that?
Marcos
11:33 PM
Sure
Marcos
11:33 PM
Algolia allows you to pass custom settings through the custom_settings option in docsearch.config.json:
https://github.com/algolia/docsearch-configs/blob/master/configs/docusaurus-2.json#L29-L30
Marcos
11:34 PM
I expected Typesense to do the same, so I wouldn't need to fork the scraper and the GitHub Action
Marcos
11:35 PM
These custom_settings are Algolia-specific collection settings
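
A rough sketch of what is being asked for: a custom_settings block in docsearch.config.json whose values get copied onto the collection schema before it is created. Everything here is hypothetical, including the apply_custom_settings helper, the allow-list, and the config contents. It illustrates the idea only and does not describe how the Typesense scraper behaves.

```python
import json

# Keys copied from custom_settings. Both are real Typesense collection
# settings, but restricting to this allow-list is an assumption of the sketch.
ALLOWED_SETTINGS = ("token_separators", "symbols_to_index")


def apply_custom_settings(schema, config):
    """Return a copy of schema with allowed custom_settings from config applied."""
    merged = dict(schema)
    custom = config.get("custom_settings", {})
    for key in ALLOWED_SETTINGS:
        if key in custom:
            merged[key] = list(custom[key])
    return merged


# Hypothetical docsearch.config.json content, parsed from a string for the demo.
config = json.loads("""
{
  "index_name": "docs",
  "custom_settings": {
    "token_separators": ["_", "-"],
    "symbols_to_index": ["+"]
  }
}
""")

# Placeholder base schema standing in for whatever the scraper would generate.
base_schema = {
    "name": config["index_name"],
    "fields": [{"name": "content", "type": "string"}],
}
final_schema = apply_custom_settings(base_schema, config)
print(json.dumps(final_schema, indent=2))
```

Because the settings live in the config file, they would be reapplied on every scraper run, which avoids both forking and hand-editing the schema.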
Jason
11:38 PM
Ah, I see. Interesting!
Jason
11:39 PM
Could you open a GitHub issue for this? I can add support for this in the scraper
Marcos
11:39 PM
sure