#community-help

Finding and Processing 6GB Typesense Book Search Data

TLDR satish asked where to find the 6GB dataset behind the Typesense Book Search demo for a POC. Kishore Nallan pointed to the Open Library data and the demo's processing scripts. Jason clarified that there is no 6GB download: you download the full Open Library dataset and run the scripts against it to extract the necessary fields.

Jun 07, 2022 (17 months ago)
satish
09:52 AM
Hi, I have seen the Typesense Book Search demo and I am trying to use similar data for a POC. Where can I find the full 6GB dataset?
Kishore Nallan
09:56 AM
We use the dataset from Open Library: https://openlibrary.org/data
Kishore Nallan
09:57 AM
See this repo for some processing / indexing scripts used for the demo: https://github.com/typesense/showcase-books-search
satish
10:09 AM
Thanks Kishore. I can see a 23 GB download, but nothing at 6 GB. Is it possible to share a download link?
Jason
01:39 PM
I don't think there's a 6GB download. IIRC I did have to download the large dataset and extract just the fields I needed to create the demo dataset
satish
01:40 PM
Oh. Do you have the dataset by any chance?
satish
01:41 PM
My bad. Will the scripts Kishore shared above be helpful here?
Jason
01:41 PM
Right. So you’d download the Open Library dataset and run the scripts against it.
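For anyone who wants to see roughly what "extract just the fields I needed" could look like, here is a minimal Python sketch. It is not the script from the showcase-books-search repo. It assumes the Open Library dump is a gzip-compressed, tab-separated text file whose fifth column holds the record as JSON, and the field list (`key`, `title`, `subjects`), the function names, and the file names are all hypothetical choices for illustration.

```python
import gzip
import json
from typing import Iterable, Iterator

# Hypothetical field selection; the fields used for the actual demo may differ.
FIELDS = ("key", "title", "subjects")


def extract_records(lines: Iterable[str], fields=FIELDS) -> Iterator[dict]:
    """Yield slimmed-down records from dump lines.

    Assumes each line has five tab-separated columns
    (type, key, revision, last_modified, JSON record).
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t", 4)
        if len(parts) < 5:
            continue  # skip lines that do not match the assumed layout
        try:
            record = json.loads(parts[4])
        except json.JSONDecodeError:
            continue  # skip rows whose JSON column cannot be parsed
        if "title" not in record:
            continue  # a book without a title is not useful for search
        # Keep only the selected fields that are present in this record.
        yield {name: record[name] for name in fields if name in record}


def convert_dump(dump_path: str, out_path: str) -> int:
    """Stream a gzipped dump to a JSON Lines file; return the record count."""
    count = 0
    with gzip.open(dump_path, "rt", encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for doc in extract_records(src):
            dst.write(json.dumps(doc) + "\n")
            count += 1
    return count
```

Calling something like `convert_dump("ol_dump_works.txt.gz", "books.jsonl")` (both file names are placeholders) would stream the large download line by line, so the whole file never has to fit in memory. The output is one JSON document per line, which is the format Typesense accepts for bulk imports. Dropping unused fields is what would shrink the 23 GB download toward a smaller demo dataset.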