Full Text Search Across Long Books: Chunking vs. Single Document
TL;DR: Epi asked whether full-text search (FTS) across long books requires splitting each book into chunks. Kishore Nallan recommended chunking for better performance and query results.
Dec 03, 2022 (9 months ago)
Epi
08:03 PM If I'd like to provide FTS across a collection of long books, is it necessary to break each book into chunks for performance, or can a single document be the whole book itself?
Dec 04, 2022 (9 months ago)
Kishore Nallan
12:48 AM You definitely need to break it into chunks, because that is what lets you provide multiple relevant highlights for a given query. You can always use group by to group the results meaningfully at a per-chapter or per-page level if needed.
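
A rough sketch of what that could look like, not an official recipe. It assumes each book is already split into chapter strings, chunks by a fixed word count, and uses hypothetical field names (`book_id`, `chapter_key`, `text`) and a hypothetical `max_words` setting. The second helper only builds a search-parameter dict using Typesense's `q`, `query_by`, `group_by` and `group_limit` parameters; the field named in `group_by` would need to be declared as a facet in the collection schema.

```python
from typing import Dict, Iterator, List


def chunk_book(book_id: str, chapters: List[str], max_words: int = 200) -> Iterator[Dict]:
    """Yield one small document per chunk of at most `max_words` words.

    Field names here are illustrative assumptions, not a required schema.
    """
    for chapter_no, chapter_text in enumerate(chapters, start=1):
        words = chapter_text.split()
        for start in range(0, len(words), max_words):
            chunk_no = start // max_words
            yield {
                "id": f"{book_id}-{chapter_no}-{chunk_no}",
                "book_id": book_id,
                # Single string key so results can be grouped per chapter.
                "chapter_key": f"{book_id}-{chapter_no}",
                "text": " ".join(words[start:start + max_words]),
            }


def grouped_search_params(query: str, group_limit: int = 3) -> Dict:
    """Build search parameters that group matching chunks by chapter."""
    return {
        "q": query,
        "query_by": "text",
        "group_by": "chapter_key",   # must be a facet field in the schema
        "group_limit": group_limit,  # max chunk hits returned per chapter
    }


# Toy input: a 300-word chapter and a 200-word chapter.
chapters = ["Call me Ishmael. " * 100, "It was a whale. " * 50]
docs = list(chunk_book("moby-dick", chapters, max_words=200))
params = grouped_search_params("whale")
```

The chunk documents would then be imported into a collection and `params` passed to the search call of whichever Typesense client is in use. Each chunk that matches is its own hit with its own highlight, and grouping folds those hits back into one entry per chapter.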

Similar Threads
Discussing Large Document Indexing in Word Files
robert asked about indexing large word files. Kishore Nallan advised splitting into smaller documents for improved performance.

2
13mo
Solved
Optimal Indexing and Querying of Large Documents
Robert asks about the best practice for indexing large documents and the ideal size of subdocuments. Jason suggests experimenting with 10K words in a single document and performance testing.
Addressing TS Cloud Highlight Issues
Orion expressed concerns about TS Cloud's highlight handling in large documents. Kishore Nallan suggested a workaround by segmenting long texts into smaller documents.