New to Typesense - Scraping Websites & Accessing Data
TLDR Raj was unsure about locating and accessing their scraped data, and Jason guided them to find the collection and use the search endpoint to query it.

May 31, 2023 (4 months ago)
Raj
03:54 PM I installed Typesense on my Mac and am able to search the books catalogue from the localhost UI portal. Then I scraped the website using DocSearch via the Docker image. It looks like the scraping was also successful. However, I am not able to figure out the following:
1. Where are the scraped documents stored? I looked at
data-dir = /opt/homebrew/var/lib/typesense
but I don't see the data there.
2. How can I point my scraped data to the above UI portal so that I can query it?
3. I scraped with the index name support.
curl https://localhost:8108/collections/support/documents/search?q=gift
is failing with no matches found.
Jason
06:53 PM GET /collections
You’ll see the collection name, and then you can use the search endpoint to query this collection.
Jason
07:07 PM
Jun 01, 2023 (3 months ago)
Raj
12:07 AM curl -X GET 'http://localhost:8108/collections' -H 'Content-Type: application/json' -H 'X-TYPESENSE-API-KEY: xyz'
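
For anyone following the same two steps from a script rather than curl, here is a minimal Python sketch of what Jason describes: list the collections, then search the one that holds the scraped pages. Several things here are assumptions rather than details from the thread: the server is reached over plain HTTP on localhost:8108, `xyz` is a placeholder API key, and `content` is assumed to be a searchable field in the scraped collection. Typesense's search endpoint requires a `query_by` parameter in addition to `q`, so the field name should be checked against the schema that `GET /collections` returns. The helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8108"  # assumption: local server, plain HTTP
API_KEY = "xyz"                     # placeholder key, as in the thread


def build_request(path, params=None):
    """Build a GET request for a Typesense endpoint, with the API key header."""
    url = BASE_URL + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"X-TYPESENSE-API-KEY": API_KEY})


def list_collections_request():
    """Request for GET /collections, which returns a JSON array of collections."""
    return build_request("/collections")


def search_request(collection, query, query_by):
    """Request for the search endpoint; q and query_by are both required."""
    return build_request(
        f"/collections/{collection}/documents/search",
        {"q": query, "query_by": query_by},
    )


def fetch_json(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def main():
    # Step 1: find out what the scraped collection is actually called.
    names = [c["name"] for c in fetch_json(list_collections_request())]
    print("collections:", names)

    # Step 2: search it. "support" and "content" are assumptions; substitute
    # the collection name and field names reported in step 1.
    result = fetch_json(search_request("support", "gift", "content"))
    print("found:", result["found"])
```

Calling `main()` needs a running Typesense server. The request builders can be inspected without one, for example `search_request("support", "gift", "content").full_url`.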
