# community-help
v
Hi folks! šŸ‘‹ We are currently running a Typesense collection with 3072-dimensional embeddings (float32) and facing significant RAM usage issues as the dataset grows.

Current situation:
• Schema uses `float[]` with `num_dim: 3072`
• Memory consumption is becoming a bottleneck
• Considering reducing dimensions to 1024 for better resource efficiency

Questions:
1. Migration strategy: What's the recommended approach for changing embedding dimensions from 3072 to 1024 in an existing collection? Can this be done in-place (e.g. via PQ compression) or does it require full re-indexing?
2. Memory optimization: Are there any Typesense-specific configurations or techniques to reduce the memory footprint of high-dimensional vectors without changing dimensions?
3. Future considerations: Any plans for built-in vector compression/quantization features in upcoming Typesense versions?

Environment: 2 sets of embeddings per document (retrieval & classification), 4 GB RAM (~2.8 GB used), 70K documents

Would appreciate any insights/solutions from the community to resolve this bottleneck!
k
1. It requires reindexing.
2. We currently don't have a way to compress vector storage; we don't support binarization or PQ encoding yet.
3. Yes, we do have plans to add these, but they're not on our immediate roadmap.
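Since reindexing is required, the migration would mean creating a new collection with `num_dim: 1024` and re-importing the documents with smaller vectors. If the embedding model was trained for truncation (Matryoshka-style, e.g. OpenAI's `text-embedding-3` family), the existing 3072-dim vectors can be sliced and re-normalized instead of re-embedding everything; a minimal sketch of that step (pure Python, `truncate_embedding` is a hypothetical helper name):

```python
import math

def truncate_embedding(vec: list[float], target_dim: int = 1024) -> list[float]:
    """Truncate a Matryoshka-style embedding to target_dim and re-normalize
    to unit length. Only valid if the embedding model supports truncation;
    otherwise you must re-embed the source text at the smaller dimension."""
    head = vec[:target_dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]
```

You would run each document's stored vectors through this (or through a fresh embedding call) while importing into the new 1024-dim collection, then alias or swap the collections once the import completes.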
v
Gotcha! Thanks for the reply, looking forward šŸ™‚