#community-help

Understanding Dataset Sizes and Data Types for Typesense

TLDR Ethan asked about dataset size limits and supported data formats for Typesense. Jason explained that Typesense works as long as the dataset fits in RAM, and added that Typesense only supports JSONL.

Dec 18, 2022 (10 months ago)
Ethan
03:03 PM
Hey everyone! Pretty basic question here, but I'm wondering what size data sets are appropriate for Typesense? I have a system where I want to search through fairly large transcripts.
Jason
04:09 PM
As long as you can fit the entire dataset in RAM, Typesense should be able to handle it

If X is the size of the dataset, you need 2x-3x RAM to index the data in Typesense
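To make the rule of thumb above concrete, here is a minimal Python sketch of the arithmetic. The function name `estimate_ram_bytes` and the 10 GiB example are assumptions made for illustration; actual memory use will depend on the schema and on which fields are indexed.

```python
def estimate_ram_bytes(dataset_bytes):
    """Return the (low, high) RAM range implied by the 2x-3x rule of thumb."""
    return 2 * dataset_bytes, 3 * dataset_bytes


GIB = 1024 ** 3

# Hypothetical example: a 10 GiB dataset of transcripts.
low, high = estimate_ram_bytes(10 * GIB)
print(f"Plan for roughly {low // GIB}-{high // GIB} GiB of RAM")
# prints "Plan for roughly 20-30 GiB of RAM"
```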
Ethan
04:23 PM
So once you want to quickly search large data sets, would this involve setting up a database? Or is there no workaround? I can't seem to think of how you would query everything quickly without having access to all of the data in memory.
Jason
04:25 PM
Yeah, using an engine like Typesense that’s optimized for full-text search is what you’d have to do if you want performance

Ethan
04:25 PM
Got it
Ethan
04:25 PM
Lastly, is Typesense built to be able to read in CSV, or does it have to be JSON?
Jason
04:26 PM
Typesense only supports JSONL, but here’s a one-liner to convert CSV to JSONL: https://typesense.org/docs/0.23.1/api/documents.html#import-a-csv-file
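As an alternative to the one-liner in the linked docs, the same conversion can be sketched in Python with only the standard library. This is an illustrative sketch, not the method from the docs: the helper name `csv_to_jsonl` and the sample columns are made up for the example. Every value comes out as a string, so numeric fields may need casting to match the collection schema before import.

```python
import csv
import io
import json


def csv_to_jsonl(csv_file, jsonl_file):
    """Write one JSON object per CSV row, keyed by the header row.

    Returns the number of rows written.
    """
    count = 0
    for row in csv.DictReader(csv_file):
        jsonl_file.write(json.dumps(row) + "\n")
        count += 1
    return count


# Hypothetical in-memory sample; real use would pass open file handles instead.
sample = io.StringIO("id,title\n1,First transcript\n2,Second transcript\n")
out = io.StringIO()
csv_to_jsonl(sample, out)
print(out.getvalue(), end="")
# {"id": "1", "title": "First transcript"}
# {"id": "2", "title": "Second transcript"}
```

With files on disk, open the CSV with `newline=""` as the `csv` module recommends, and write the output with UTF-8 encoding.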

Ethan
04:26 PM
Wow it was right there. Now I feel stupid. Thanks Jason!
Jason
04:27 PM
No worries! Happy to help!