#community-help

Understanding Data Storage in Typesense

TLDR Ethan asked how to index large amounts of data when Typesense keeps its index in RAM. Jason explained that Typesense is meant to be a secondary data store, and that all data you want to surface in search results must be stored in Typesense.

Jan 11, 2023
Ethan
10:05 PM
So considering only so much data can be stored in RAM and the pricing scales with that, how does one feasibly "index" lots of data?
Jason
10:07 PM
Could you define “lots of data”?
Ethan
10:08 PM
Well honestly, I haven't worked with databases or large sets of data much at all. But a ballpark would be something like 100GB?
Jason
10:09 PM
We have users who store that much data in Typesense… But since Typesense is an in-memory store, you’d need at least 200GB of RAM to index all the data.
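As a rough illustration of the sizing arithmetic in the message above, here is a minimal sketch. The 2x overhead factor is only inferred from the 100GB to 200GB example in this thread, and the function names, the default factor, and the 1024GB figure used for a "1TB" cluster are all assumptions for illustration, not official Typesense sizing guidance.

```python
def estimate_ram_gb(dataset_gb: float, overhead_factor: float = 2.0) -> float:
    """Return a rough RAM estimate for indexing `dataset_gb` of data.

    overhead_factor=2.0 is an assumption taken from the
    "100GB of data needs at least 200GB of RAM" example in this thread.
    """
    if dataset_gb < 0 or overhead_factor <= 0:
        raise ValueError("dataset_gb must be >= 0 and overhead_factor > 0")
    return dataset_gb * overhead_factor


def fits_in_cluster(dataset_gb: float,
                    cluster_ram_gb: float = 1024.0,
                    overhead_factor: float = 2.0) -> bool:
    """Check whether the estimated RAM need fits a cluster of the given size.

    cluster_ram_gb=1024.0 is a hypothetical stand-in for a 1TB cluster.
    """
    return estimate_ram_gb(dataset_gb, overhead_factor) <= cluster_ram_gb


if __name__ == "__main__":
    print(estimate_ram_gb(100))   # estimate for the 100GB ballpark
    print(fits_in_cluster(100))   # does it fit a 1TB cluster?
```

Indexing only the fields you actually search on would shrink `dataset_gb`, which is usually the cheapest lever before sizing up the cluster.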
Jason
10:10 PM
In Typesense Cloud, we have clusters with up to 1TB of RAM
Jason
10:11 PM
Of course, at a certain point it becomes a question of cost-benefit - whether you need the performance that an in-memory datastore gives you, compared to the cost
Ethan
10:11 PM
Right! I guess I was wondering if there were ways to incorporate a primary database that stores all of the data on disk and use Typesense as secondary.
Jason
10:11 PM
You definitely want to do that, regardless of this question.
Ethan
10:11 PM
Ah okay, got it. So Typesense is exclusively in-memory.

Jason
10:12 PM
Typesense is only meant to be used as a secondary data store. You don’t want to put your only copy of data in Typesense. Instead, you’d sync a copy of the data you want to search on into Typesense.
Ethan
10:13 PM
Ah right, more specifically I meant a primary database to store all data since it's more feasible to hold large amounts of data that way, and somehow incorporate Typesense as a secondary that's capable of loading chunks or something.
Jason
10:14 PM
> capable of loading chunks or something.
This is not possible with Typesense. At a minimum, you would have to put all the data you want to surface in search results into Typesense.
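One way to picture the primary/secondary split described in this thread is sketched below: the full record stays in the primary database, and only the fields needed for searching and for rendering results are copied into a search document. The field names, `SEARCH_FIELDS`, `to_search_document`, `fetch_full_record`, and the in-memory `primary` dict are all hypothetical stand-ins. Only the detail that a Typesense document `id` is a string reflects Typesense itself.

```python
from __future__ import annotations

from typing import Any

# Hypothetical: the fields you want to search on or show in result listings.
SEARCH_FIELDS = ("title", "description")


def to_search_document(record: dict[str, Any],
                       fields: tuple[str, ...] = SEARCH_FIELDS) -> dict[str, Any]:
    """Project a primary-database record down to what search needs.

    Everything a search result must display has to be included here,
    because Typesense cannot load the rest from the primary store.
    """
    doc = {name: record[name] for name in fields if name in record}
    # Typesense document ids are strings, so normalise the primary key.
    doc["id"] = str(record["id"])
    return doc


def fetch_full_record(doc_id: str,
                      primary: dict[str, dict[str, Any]]) -> dict[str, Any] | None:
    """Look up the complete record in the primary store (a dict stand-in here),
    e.g. when a user opens a detail page from a search result."""
    return primary.get(doc_id)


if __name__ == "__main__":
    record = {"id": 7, "title": "Blue mug", "description": "Ceramic",
              "internal_notes": "not searchable, stays in the primary DB"}
    primary = {str(record["id"]): record}
    doc = to_search_document(record)
    print(doc)
    print(fetch_full_record(doc["id"], primary))
```

The projected documents are what you would sync into Typesense. Large or non-searchable fields such as `internal_notes` never take up RAM there.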
Ethan
10:14 PM
Got it. Misunderstanding on my part. Thank you for the clarification!
Jason
10:14 PM
👍
