Tugay Karaçay
03/03/2021, 6:57 AM
Jason Bosco
03/03/2021, 6:58 AM
Tugay Karaçay
03/03/2021, 6:59 AM
Jason Bosco
03/03/2021, 7:00 AM
Tugay Karaçay
03/03/2021, 7:02 AM
Tugay Karaçay
03/03/2021, 7:04 AM
Jason Bosco
03/03/2021, 7:14 AM
Jason Bosco
03/03/2021, 7:14 AM
Jason Bosco
03/03/2021, 7:17 AM
> And another question: are you planning to add a sharding mechanism to Typesense?
We do replicate the data across multiple nodes for high availability. However, if you're talking about partitioning the data and storing a subset on different nodes, we don't have plans for that at the moment. But you can always do application-side sharding by spinning up multiple clusters and then mapping certain user-id ranges to a particular cluster, for example. You can scale vertically up to 3TB of RAM (AWS offers this, for example), and we haven't had asks to scale beyond this size of dataset yet, so we haven't prioritized horizontal scaling.
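For anyone reading along, the application-side sharding idea above could look roughly like this. It is only a minimal sketch: the cluster URLs, the range boundaries and the `cluster_for_user` helper are all made up for illustration, and nothing here is part of Typesense itself.

```python
from bisect import bisect_right

# Hypothetical shard map: each entry is (first user ID served, cluster URL).
# Every URL stands for an independent Typesense cluster (assumed names).
SHARDS = [
    (0, "https://cluster-a.example.com"),
    (1_000_000, "https://cluster-b.example.com"),
    (2_000_000, "https://cluster-c.example.com"),
]

# Sorted lower bounds, kept separately so we can binary-search them.
_LOWER_BOUNDS = [lower for lower, _ in SHARDS]


def cluster_for_user(user_id: int) -> str:
    """Return the URL of the cluster that owns this user's data."""
    if user_id < _LOWER_BOUNDS[0]:
        raise ValueError(f"user_id {user_id} is below the first shard range")
    # bisect_right returns the position of the first lower bound that is
    # greater than user_id; the shard just before it contains user_id.
    index = bisect_right(_LOWER_BOUNDS, user_id) - 1
    return SHARDS[index][1]
```

The application would then send every index and search request for a given user to the cluster returned by `cluster_for_user`. The last range is open-ended, so new users keep landing on the last cluster until another entry is added to the map.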
Tugay Karaçay
03/03/2021, 7:18 AM
Tugay Karaçay
03/03/2021, 7:19 AM
Jason Bosco
03/03/2021, 7:20 AM
Tugay Karaçay
03/03/2021, 7:20 AM
> You can scale vertically up to 3TB of RAM (AWS offers this, for example), and we haven't had asks to scale beyond this size of dataset yet, so we haven't prioritized horizontal scaling.
👍
Tugay Karaçay
03/03/2021, 7:21 AM
Jason Bosco
03/03/2021, 7:21 AM
Tugay Karaçay
03/03/2021, 7:21 AM
Jason Bosco
03/03/2021, 7:22 AM
Tugay Karaçay
03/03/2021, 7:23 AM
Jason Bosco
03/03/2021, 7:23 AM
Jason Bosco
03/03/2021, 7:23 AM
Tugay Karaçay
03/03/2021, 7:24 AM
Jason Bosco
03/03/2021, 7:24 AM
Tugay Karaçay
03/03/2021, 7:25 AM
> There are no technical limits in Typesense on the number of collections. That said, each collection spins up 4 threads to parallelize searches, so the upper limit really depends on how many CPU cores your cluster has.
Unfortunately this will be a big bottleneck for us 😞 We need to redesign our system for that.
Tugay Karaçay
03/03/2021, 7:25 AM
> I actually have a nightly build with the feature! Would you be interested in beta testing it if I give you a docker build?
I would love to 🙂
Jason Bosco
03/03/2021, 7:26 AM
Tugay Karaçay
03/03/2021, 7:27 AM
Jason Bosco
03/03/2021, 7:28 AM
> Unfortunately this will be a big bottleneck for us 😞 We need to redesign our system for that
If you allow your users to define custom fields on the product, then going down the path of one collection per user makes total sense, because the schema is different for each user. v0.20 also has some threading improvements where we'll be able to use a shared thread pool to process requests across multiple collections. So this should allow you to scale to an even higher number of collections.
Jason Bosco
03/03/2021, 7:29 AM
> One final question, sorry for taking too much of your time 🙂 Is there any limit on the number of fields?
Happy to answer! No, there are no limits on the number of fields. As long as you have sufficient RAM to hold the data, Typesense will happily chug along.
Jason Bosco
03/03/2021, 7:34 AM
Tugay Karaçay
03/03/2021, 7:45 AM
Andrew Sittermann
03/03/2021, 7:57 AM
Jason Bosco
03/03/2021, 7:59 AM
Jason Bosco
03/03/2021, 8:00 AM
Jason Bosco
03/03/2021, 8:01 AM
Tugay Karaçay
03/03/2021, 8:09 AM
include_fields and facet_by, but there may be 10k fields within a collection and I am not sure about the efficiency of this solution 😄
Kishore Nallan
03/03/2021, 9:55 AM
Tugay Karaçay
03/03/2021, 11:26 AM
facet: true on dynamic fields too, so it is not suitable for us now. Also, are you considering adding search: false and index: false to the field definition? Since we enable auto-schema detection, we may want to prevent some fields from being indexed.
Kishore Nallan
03/03/2021, 11:34 AM
facet: true is easy, held back from doing that only because facets can consume memory, so enabling it on every field (especially long text fields like description) will be a huge waste of resources. Thinking of how best to handle that. One way of doing that is to enable facets only on field names ending with a _facet suffix.
> And also are you considering to add search: false and index: false to field definition
Would you know upfront which fields will not need to be searched upon?
Tugay Karaçay
03/03/2021, 11:54 AM
Tugay Karaçay
03/03/2021, 12:01 PM
> Thinking of how best to handle that. One way of doing that is to enable facets only on field names ending with a _facet suffix
This is a good solution but not a flexible one; using wildcards could be considered. For example, in a fields definition we could use the following syntax to dynamically match field definitions:
[
  {
    name: 'created_at',
    type: 'int64'
  },
  {
    name: '*_auto',
    type: 'auto'
  },
  {
    name: '*_fct',
    type: 'auto',
    facet: true
  },
  // stringify rest
  {
    name: '*',
    type: 'stringify',
    facet: true
  }
]
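To make the proposal concrete, here is a rough sketch of how such wildcard rules might be resolved against incoming field names. It is not how Typesense works; it only illustrates the suggested syntax. The first-match-wins ordering, the `resolve_field` helper and the use of shell-style globbing are all assumptions on my side, and `stringify` is the type name from the proposal above, not an existing type.

```python
from fnmatch import fnmatchcase

# The proposed rules, in priority order (assumption: first match wins).
FIELD_RULES = [
    {"name": "created_at", "type": "int64"},
    {"name": "*_auto", "type": "auto"},
    {"name": "*_fct", "type": "auto", "facet": True},
    # Catch-all: stringify everything else.
    {"name": "*", "type": "stringify", "facet": True},
]


def resolve_field(field_name, rules=FIELD_RULES):
    """Return the definition for field_name from the first matching rule."""
    for rule in rules:
        # fnmatchcase does case-sensitive glob matching; a pattern without
        # wildcards only matches the exact same name.
        if fnmatchcase(field_name, rule["name"]):
            # Copy the rule, but report the concrete field name.
            return {**rule, "name": field_name}
    return None  # only reachable when there is no catch-all rule


print(resolve_field("price_auto"))
print(resolve_field("brand_fct"))
print(resolve_field("notes"))
```

With first-match-wins ordering the catch-all `*` rule has to come last, otherwise it would shadow the more specific patterns.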
Kishore Nallan
03/03/2021, 12:06 PM
index: false configuration can also be mentioned in the same way.
Gabe O'Leary
03/03/2021, 4:29 PM
> Now if you use scopedApiKey to do searches instead of the main search api key, the server will automatically enforce the embedded exclude_fields param and users can't override it.
I'm using exactly this, to protect sensitive data & prevent excess data from being transmitted over the wire.
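For context, a scoped key with an embedded exclude_fields param could be derived roughly as below. This is a sketch modeled on the HMAC-based scheme as I understand it from the official client libraries; treat the exact byte layout as an assumption and prefer the scoped-key helper that ships with your client. The key value and the field names are made up.

```python
import base64
import hashlib
import hmac
import json


def generate_scoped_search_key(search_key: str, embedded_params: dict) -> str:
    """Derive a scoped key that carries embedded search parameters."""
    params_json = json.dumps(embedded_params)
    # Sign the parameters with the parent search-only key so they cannot be
    # altered without invalidating the signature.
    digest = base64.b64encode(
        hmac.new(
            search_key.encode("utf-8"),
            params_json.encode("utf-8"),
            hashlib.sha256,
        ).digest()
    ).decode("utf-8")
    # The first 4 characters of the parent key identify which key signed it.
    key_prefix = search_key[:4]
    raw_scoped_key = f"{digest}{key_prefix}{params_json}"
    return base64.b64encode(raw_scoped_key.encode("utf-8")).decode("utf-8")


# Hypothetical parent key and fields to hide from end users.
scoped_key = generate_scoped_search_key(
    "abcd1234searchonlykey", {"exclude_fields": "email,internal_notes"}
)
print(scoped_key)
```

The scoped key should be generated on the server from a search-only parent key and handed to the browser or app, so the parent key itself never leaves the backend.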
Jason Bosco
03/03/2021, 6:05 PM
Gabe O'Leary
03/03/2021, 6:06 PM
Jason Bosco
03/03/2021, 6:13 PM