Hello everyone, I am just starting with TypeSense ...
# community-help
c
Hello everyone, I am just starting with Typesense. I was wondering how I should prepare the data for indexing. I have a lot of it, with different layouts... How would you proceed? Thanks in advance.
j
In general, you'd want to put all documents with the same fields and attribute types in one collection, similar to what you'd do in a relational database. A collection is roughly equivalent to a table.
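A rough sketch of that grouping idea in plain Python: records are bucketed by their field names and value types, and each bucket would map to one collection. The sample records and the `shape_of` / `group_by_shape` helpers are made up for illustration, and nothing here calls Typesense itself.

```python
from collections import defaultdict


def shape_of(record):
    """Return a hashable signature: sorted (field name, value type name) pairs."""
    return tuple(sorted((name, type(value).__name__) for name, value in record.items()))


def group_by_shape(records):
    """Bucket records that share the same fields and value types."""
    groups = defaultdict(list)
    for record in records:
        groups[shape_of(record)].append(record)
    return dict(groups)


# Hypothetical sample data with two different layouts.
records = [
    {"title": "Dune", "pages": 612},
    {"title": "Emma", "pages": 474},
    {"name": "Ada", "email": "ada@example.com"},
]

groups = group_by_shape(records)
for shape, members in groups.items():
    print(shape, len(members))
```

Each distinct shape would then get its own collection, with a schema listing those fields and types.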
c
Let's imagine I have books in PDF, how would you proceed?
j
You want to extract the text out of the PDF into paragraphs (there are libraries that do this for you), then create one document per paragraph in Typesense to get the most relevant results.
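A minimal sketch of the "one document per paragraph" step, assuming the text has already been extracted from the PDF by some library and that paragraphs are separated by blank lines. The field names (`book`, `position`, `text`) and the `paragraphs_to_documents` helper are illustrative choices, not something Typesense requires.

```python
import re


def paragraphs_to_documents(book_title, text):
    """Split extracted text on blank lines and build one document per paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        # Collapse line wraps inside a paragraph into single spaces.
        {"book": book_title, "position": i, "text": " ".join(p.split())}
        for i, p in enumerate(paragraphs)
    ]


# Hypothetical extracted text: two paragraphs, the first wrapped over two lines.
sample_text = "First paragraph,\nwrapped over two lines.\n\nSecond paragraph.\n"
docs = paragraphs_to_documents("Example Book", sample_text)
for doc in docs:
    print(doc)
```

The resulting list of dictionaries could then be imported into a collection whose schema declares the same fields.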
c
let's imagine this as an example:
message has been deleted
and many pages like this
would you create blocks (heading + paragraphs) together?
j
Yup
c
Then here we have several paragraphs. Would you put the same heading on each of those paragraphs?
j
If you need to show the paragraph heading next to each search result, yes.
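As a sketch of what repeating the heading looks like in practice, assuming the pages have already been parsed into blocks of one heading plus its paragraphs: each paragraph becomes its own document and carries a copy of the heading. The block structure, the field names, and the `blocks_to_documents` helper are hypothetical.

```python
def blocks_to_documents(blocks):
    """Flatten heading+paragraphs blocks into one document per paragraph."""
    docs = []
    for block in blocks:
        for paragraph in block["paragraphs"]:
            docs.append(
                {
                    "id": str(len(docs)),  # Typesense document ids are strings
                    "heading": block["heading"],  # repeated for every paragraph
                    "text": paragraph,
                }
            )
    return docs


# Hypothetical parsed page: one heading with two paragraphs, another with one.
blocks = [
    {"heading": "Installation", "paragraphs": ["Download the package.", "Run the installer."]},
    {"heading": "Usage", "paragraphs": ["Start the server."]},
]

docs = blocks_to_documents(blocks)
for doc in docs:
    print(doc["heading"], "|", doc["text"])
```

A search hit on any paragraph then has its heading available for display without a second lookup.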