Mar 17, 2022 (18 months ago)
07:43 AM
Hi Kishore Nallan 👋
Attended your awesome talk today at GitHub. I asked a few questions related to Raft, if you remember, but I have a bunch of others: Is the trie always kept in memory? How do you ensure durability of the trie while indexing? Is sharding of tries also possible? Can you point me to a design doc or something similar that I can read to get more info, or to the relevant code folders where I can dig up info myself?
07:48 AM
👋 Glad you liked the talk.

1. All indexing data structures are stored in-memory, including the Trie. Here's the trie implementation, which is forked off a simpler library: https://github.com/typesense/typesense/blob/master/src/art.cpp
2. The trie is reconstructed on start; only raw documents are stored on disk. This allows us to modify or introduce new data structures without the baggage of migrating on-disk structures, which can be cumbersome. The downside is that there is some "bootstrapping" time, as the indexes are built from scratch from the raw documents. But this is again a trade-off chosen specifically for the kind of use cases and datasets we've chosen to support.
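
To make point 2 concrete, here is a minimal sketch of the rebuild-on-start idea. It is not Typesense's code: it uses a plain character trie instead of the adaptive radix tree in `art.cpp`, and the names `Trie`, `rebuild_index`, and `raw_documents` (a stand-in for the documents read back from disk) are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # char -> TrieNode
        self.doc_ids = set()  # ids of documents containing the token ending here


class Trie:
    """Plain in-memory trie; a simplified stand-in for an adaptive radix tree."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, token, doc_id):
        node = self.root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.doc_ids.add(doc_id)

    def prefix_search(self, prefix):
        # Walk down to the node that matches the prefix, if any.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return set()
        # Collect document ids from every token stored below that node.
        found, stack = set(), [node]
        while stack:
            current = stack.pop()
            found |= current.doc_ids
            stack.extend(current.children.values())
        return found


def rebuild_index(raw_documents):
    """Build the index from scratch; nothing about the trie itself is persisted."""
    trie = Trie()
    for doc in raw_documents:
        for token in doc["title"].lower().split():
            trie.insert(token, doc["id"])
    return trie


# Hypothetical raw documents, standing in for what would be loaded from disk.
raw_documents = [
    {"id": 1, "title": "Adaptive radix tree"},
    {"id": 2, "title": "Raft consensus"},
    {"id": 3, "title": "Radix sort"},
]

index = rebuild_index(raw_documents)
print(sorted(index.prefix_search("rad")))  # ids of documents with a token starting with "rad"
```

Since only the raw documents are treated as durable here, changing the in-memory structure just means changing `rebuild_index`. The cost is that startup time grows with the number of documents, which is the bootstrapping trade-off described above.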