# community-help
v
Hello, we are trying to determine whether it is better to denormalize our collections to avoid doing a two-stage search, or to keep the collections in Typesense normalized. The trade-off is between record size and performance. If we self-hosted, would the performance issue be mitigated? Are there any other considerations we are missing?
j
In general, storing data in a denormalized way is more performant in Typesense. We only store a given field's value once in the index, so it is very memory efficient, especially if you have repeated values in your docs. Re: self-hosted vs. Cloud, performance-wise there should be no difference. In fact, we run the same open source version of Typesense to power Typesense Cloud.
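To make the denormalization idea concrete, here is a minimal sketch in plain Python that flattens two normalized record sets into one document per product before indexing. The `products` and `brands_by_id` structures and all field names are hypothetical, invented for illustration, and nothing here calls the Typesense API.

```python
def denormalize(products, brands_by_id):
    """Copy each product's brand fields into the product record itself.

    Hypothetical schema: every product carries a `brand_id` that points
    into `brands_by_id`. The result needs no second lookup at search time.
    """
    flat_docs = []
    for product in products:
        brand = brands_by_id[product["brand_id"]]
        flat_docs.append({
            "id": product["id"],
            "name": product["name"],
            # Repeated across many products; this is the duplication
            # that denormalizing introduces.
            "brand_name": brand["name"],
            "brand_country": brand["country"],
        })
    return flat_docs


brands_by_id = {"b1": {"name": "Acme", "country": "US"}}
products = [
    {"id": "p1", "name": "Red Shirt", "brand_id": "b1"},
    {"id": "p2", "name": "Blue Shirt", "brand_id": "b1"},
]
flat_docs = denormalize(products, brands_by_id)
```

Each flattened document can then be searched in a single query instead of searching one collection and looking up the other.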
v
Thanks @Jason Bosco - this helps regarding how to build our collections! The performance issue I was referring to has to do with the latency of Typesense Cloud.
j
Typesense Cloud has a presence in 20 geo regions, and coupled with the Search Delivery Network feature, you should be able to get pretty low latencies, especially if you're sending queries directly from the browser/app to Typesense. If you're sending queries from your frontend to your backend and then to Typesense, then running Typesense in the same network as your backend will have the lowest latency. Alternatively, you could pick the Cloud region that's closest to your backend, and that will add maybe 10-20 ms of latency.
This is just network latency, btw. Search (CPU) processing time will be identical either way.
v
Thank you - this helps! Where can I get some documentation on "we only store a given field's value once in the index"?
j
That particular part is not documented anywhere, but it's essentially an inverted index data structure: https://en.wikipedia.org/wiki/Inverted_index
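As a rough illustration of the general idea (a toy sketch, not Typesense's actual implementation), an inverted index maps each distinct token of a field to the list of documents that contain it, so a repeated value appears only once as a key. The function name and sample documents below are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(docs, field):
    """Map each distinct lowercase token in `field` to the doc positions containing it."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        # set() so a token repeated inside one document is recorded once.
        for token in set(doc[field].lower().split()):
            index[token].append(doc_id)
    return dict(index)


docs = [
    {"title": "Red Shirt", "brand": "Acme"},
    {"title": "Blue Shirt", "brand": "Acme"},
    {"title": "Red Hat", "brand": "Acme"},
]

brand_index = build_inverted_index(docs, "brand")
title_index = build_inverted_index(docs, "title")
```

Here `brand_index` holds a single key, `"acme"`, pointing at all three documents, even though the value is repeated in every record.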
v
Thank you Jason! Appreciate your quick help on this!
j
Happy to help!