Discussing Document Indexing Speeds and Typesense Features
TLDR: Thomas asks about indexing speed and the factors that affect it. The conversation reveals that larger batch sizes and NVMe disks can improve speed, while index size is limited by available RAM. Jason shares plans to support nested fields, and they explore a solution for products that belong to multiple categories and catalogs.
Feb 23, 2022
Thomas
04:11 PM
Harrison
04:12 PM
Thomas
04:13 PM
Harrison
04:14 PM
Thomas
04:14 PM
Thomas
04:14 PM
Thomas
04:14 PM
Thomas
04:14 PM
Jason
04:15 PM
As another data point, I've indexed 2.2M docs in 3.6 minutes on a 4vCPU server
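For a rough sense of scale, Jason's data point above can be turned into an average throughput. This is only a sketch of the arithmetic: the function name is hypothetical, and the result describes that one run on that one server, not what other datasets or hardware would achieve.

```python
def docs_per_second(total_docs: int, minutes: float) -> float:
    """Average indexing throughput in documents per second."""
    return total_docs / (minutes * 60)


# Figures quoted in the thread: 2.2M documents in 3.6 minutes on 4 vCPUs.
rate = docs_per_second(2_200_000, 3.6)
per_vcpu = rate / 4

print(f"{rate:,.0f} docs/s overall, about {per_vcpu:,.0f} docs/s per vCPU")
```

That is on the order of ten thousand documents per second overall for the quoted run.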
Thomas
04:15 PM
Thomas
04:15 PM
Jason
04:15 PM
Thomas
04:16 PM
Kishore Nallan
04:16 PM
Harrison
04:16 PM
Thomas
04:17 PM
Harrison
04:17 PM
Thomas
04:17 PM
Jason
04:17 PM
Thomas
04:17 PM
Jason
04:18 PM
Harrison
04:18 PM
Jason
04:18 PM
Thomas
04:18 PM
Jason
04:18 PM
Harrison
04:18 PM
Thomas
04:19 PM
Thomas
04:19 PM
Jason
04:19 PM
Harrison
04:19 PM
Harrison
04:19 PM
Jason
04:20 PM
Thomas
04:20 PM
Harrison
04:20 PM
Thomas
04:20 PM
Jason
04:20 PM
Thomas
04:21 PM
Jason
04:21 PM
Thomas
04:21 PM
Thomas
04:21 PM
Harrison
04:22 PM
Thomas
04:23 PM
Jason
04:23 PM
Jason
04:24 PM
Yes for sure, probably in the next few releases. Until then, here's a workaround: https://typesense.org/docs/0.22.2/api/collections.html#indexing-nested-fields
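The workaround Jason links to amounts to flattening nested objects into top-level keys before sending documents to Typesense. A minimal sketch of that idea follows, assuming a dot as the key separator; the `flatten` helper and the sample product are hypothetical and not taken from the thread, and arrays of objects are deliberately left untouched.

```python
from typing import Any, Dict


def flatten(doc: Dict[str, Any], parent: str = "", sep: str = ".") -> Dict[str, Any]:
    """Recursively lift nested dict values into top-level keys.

    Lists and scalar values are copied as-is; only dicts are expanded.
    """
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Hypothetical product with one nested object.
product = {"id": "1", "name": "Desk", "dimensions": {"width": 120, "height": 75}}
flat_product = flatten(product)
print(flat_product)
```

Each flattened key would then need its own entry in the collection schema.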
Thomas
04:27 PM
Thomas
04:27 PM
Jason
04:28 PM
Thomas
04:31 PM
Jason
04:31 PM
Jason
04:32 PM
Thomas
04:32 PM
Jason
04:32 PM
Thomas
04:33 PM
Jason
04:34 PM
catalog_ids: [1,4,6] field in each product
Jason
04:34 PM
Thomas
04:38 PM
Thomas
04:38 PM
Jason
04:40 PM
Thomas
04:42 PM
Thomas
04:43 PM
Jason
04:43 PM
Thomas
05:11 PM
Jason
05:24 PM
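Returning to the `catalog_ids: [1,4,6]` suggestion from 04:34 PM: only that fragment of the message survives, so the following is a sketch under assumptions rather than the approach actually agreed on in the thread. It assumes the idea was to store every catalog a product belongs to in an integer array field and to filter on that field at search time. The collection name, field names, sample product, and helper functions are all hypothetical; `int32[]` and the `filter_by` search parameter are standard Typesense features.

```python
# Hypothetical schema: each product carries the IDs of all catalogs it is in.
products_schema = {
    "name": "products",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "catalog_ids", "type": "int32[]", "facet": True},
    ],
}

# Hypothetical document using the field shape quoted in the thread.
product = {"id": "42", "name": "Standing desk", "catalog_ids": [1, 4, 6]}


def catalog_search_params(query: str, catalog_id: int) -> dict:
    """Build search parameters restricted to a single catalog."""
    return {
        "q": query,
        "query_by": "name",
        "filter_by": f"catalog_ids:={catalog_id}",
    }


def in_catalog(doc: dict, catalog_id: int) -> bool:
    """Local stand-in for the filter, useful for checking test data."""
    return catalog_id in doc.get("catalog_ids", [])


params = catalog_search_params("desk", 4)
print(params, in_catalog(product, 4))
```

With this layout a product is stored once, and membership in several catalogs is expressed by the array rather than by duplicating the document per catalog.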