Reducing Memory Usage in Large Dataset Indexing

TLDR Alan asked for an article on reducing memory usage when indexing large datasets. Jason provided general tips without sharing an article.

May 11, 2023
Alan
03:57 PM
Is there a good article on reducing memory when indexing large datasets?
Jason
04:52 PM
I don’t have an article to share, but in general:

• Faceting adds memory overhead, so only turn it on for fields you’re actually using in facet_by
• Enabling string sorting on a field adds memory overhead
• Enabling infix on a field adds memory overhead
• Only specify fields in the schema that you’re searching / filtering / faceting / grouping / sorting on. You don’t have to mention every field of your document in the collection schema, and you can still send additional fields in the document when indexing. These additional fields are stored on disk and returned when the document is a hit, and they won’t count towards memory consumption
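
To make the tips above concrete, here is a minimal sketch in Python. It assumes a Typesense-style collection schema expressed as a plain dict, where the `facet`, `sort` and `infix` field options are the switches being discussed. The collection name, field names, sample document and both helper functions are hypothetical and only illustrate the idea; they are not part of any library.

```python
# Hypothetical schema: only the fields that are searched, filtered,
# faceted or sorted on are declared. Names are illustrative only.
products_schema = {
    "name": "products",
    "fields": [
        {"name": "title", "type": "string"},                 # searched
        {"name": "brand", "type": "string", "facet": True},  # used in facet_by
        {"name": "price", "type": "float"},                  # filtered / sorted
    ],
}

# Hypothetical document: it carries extra keys that the schema does not
# mention. Per the tips above, such fields stay on disk and are returned
# with hits, but are not indexed in memory.
document = {
    "id": "1",
    "title": "Trail running shoe",
    "brand": "Acme",
    "price": 89.5,
    "description": "A long marketing text that is only displayed.",
    "image_url": "https://example.com/shoe.jpg",
}


def memory_costly_fields(schema):
    """Map field name -> enabled options that add memory overhead."""
    flagged = {}
    for field in schema["fields"]:
        enabled = []
        if field.get("facet"):
            enabled.append("facet")
        # The tip is specifically about sorting on string fields.
        if field.get("sort") and field["type"].startswith("string"):
            enabled.append("sort")
        if field.get("infix"):
            enabled.append("infix")
        if enabled:
            flagged[field["name"]] = enabled
    return flagged


def unindexed_keys(schema, doc):
    """Document keys not declared in the schema (stored, not indexed)."""
    declared = {field["name"] for field in schema["fields"]}
    return sorted(key for key in doc if key not in declared and key != "id")


print(memory_costly_fields(products_schema))
print(unindexed_keys(products_schema, document))
```

Running a check like this against an existing schema is one way to spot fields where faceting, string sorting or infix was enabled but is not actually used in any query.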