Preparing Data for Indexing in Typesense
TLDR: Charles asked how to prepare diverse data for indexing in Typesense. Jason suggested creating documents that group similar fields and extracting the text from PDFs into paragraphs.
Jun 21, 2022
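A minimal sketch of the paragraph approach Jason describes, assuming the PDF text has already been extracted by a separate tool. The field names (`source`, `paragraph_number`, `content`) and the ID scheme are hypothetical and not taken from the thread. The output is newline-delimited JSON, the format Typesense's bulk import endpoint accepts.

```python
import json
import re


def split_into_paragraphs(text: str) -> list[str]:
    """Split extracted text on blank lines and collapse internal whitespace."""
    chunks = re.split(r"\n\s*\n", text)
    paragraphs = [" ".join(chunk.split()) for chunk in chunks]
    # Drop empty chunks left by leading or trailing blank lines.
    return [p for p in paragraphs if p]


def build_documents(source_name: str, text: str) -> list[dict]:
    """Create one document per paragraph (field names are hypothetical)."""
    return [
        {
            "id": f"{source_name}-{index}",
            "source": source_name,
            "paragraph_number": index,
            "content": paragraph,
        }
        for index, paragraph in enumerate(split_into_paragraphs(text), start=1)
    ]


def to_jsonl(documents: list[dict]) -> str:
    """Serialize documents as JSONL, one JSON object per line."""
    return "\n".join(json.dumps(doc) for doc in documents)


# Stand-in for text extracted from a PDF; the line wrap inside the first
# paragraph mimics typical extraction output.
sample_text = "First paragraph of the\nreport.\n\nSecond paragraph.\n\n\n"
documents = build_documents("report.pdf", sample_text)
print(to_jsonl(documents))
```

Keeping one paragraph per document means a search hit points at a specific passage rather than a whole file, and the `source` field lets results be grouped back by original document.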
Charles 03:53 PM
Jason 03:56 PM
Charles 03:56 PM
Jason 03:57 PM
Charles 03:59 PM
Charles 03:59 PM
Charles 03:59 PM
Charles 04:00 PM
Jason 04:00 PM
Charles 04:00 PM
Jason 04:01 PM
Similar Threads
Troubleshooting Typesense Document Import Error
Christopher had trouble importing 2.1M documents into Typesense because of memory errors. Jason clarified the system requirements, explained how required RAM scales with dataset size, and suggested ways to tackle the issue. The two also discussed database-like query options.
Using Typesense to Index Large Amounts of Data
Rafael wants to use Typesense to index 100M documents currently stored in MongoDB Atlas. Jason affirmed that Typesense can handle it and asked for more details.
Deleting and Creating Documents in Typesense
Priyank asked for help with deleting and creating documents in Typesense and found that the problem was in their own code. Jason offered support.
Nested Objects and Arrays in Typesense
Robert sought advice on managing nested objects in Typesense. Kishore Nallan said that support for nested objects was upcoming, described a current workaround, and shared a link for following the issue after Robert asked to track the feature.
Discussing Document Indexing Speeds and Typesense Features
Thomas asked about indexing speed and the factors that affect it. The conversation showed that larger batch sizes and NVMe disks can improve speed, but that index size is limited by RAM. Jason shared plans to support nested fields, and they explored a solution for products that belong to multiple categories and catalogs.