Daniel Martel
02/20/2025, 2:19 AM
`text-embedding-005` model, but the docs aren't super clear on how Typesense handles embedding documents (or I missed it):
• I want to create 256-dimensional embeddings... is setting `num_dim` enough?
• Google has this concept of Task Types - is that used? Using `RETRIEVAL_DOCUMENT` and `RETRIEVAL_QUERY` might be optimal... not 100% sure.
• I write to this collection daily, but the fields I want to embed don't change that often. Does Typesense only update the embedding if an `embed.from` field changes, or does any write trigger a re-computation regardless?
• I have a collection with ~800k documents... is it worth trying a batch size above 200? Not sure if I'll hit rate limits or anything.
• One of the fields I want to embed can be very long - do you truncate it to a certain max length?
• How do you preprocess and format/order the text to embed if there are multiple `embed.from` fields (and arrays, etc.)?
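
For context, here is a minimal sketch of the kind of setup these questions are about. It assumes Typesense's auto-embedding schema shape (an `embed` object with `from` and `model_config`) and the `remote_embedding_batch_size` import parameter. The collection name, the field names, the `gcp/text-embedding-005` model string, and the credential placeholders are all hypothetical. Whether `num_dim` alone is enough to get 256-dimensional vectors from this model is exactly the open question, so treat that line as an assumption and not as confirmed behavior.

```python
import json


def build_schema(embed_fields, num_dim=256):
    """Build a collection schema dict with one auto-embedding field.

    Names and the model string are illustrative assumptions, not confirmed config.
    """
    return {
        "name": "articles",  # hypothetical collection name
        "fields": [
            # Plain string fields that feed the embedding.
            *[{"name": name, "type": "string"} for name in embed_fields],
            {
                "name": "embedding",
                "type": "float[]",
                # Assumption: num_dim by itself requests 256-dim output.
                "num_dim": num_dim,
                "embed": {
                    "from": list(embed_fields),
                    "model_config": {
                        # Assumed model identifier; check the docs for the exact string.
                        "model_name": "gcp/text-embedding-005",
                        # Placeholder credentials, never real values.
                        "access_token": "ACCESS_TOKEN",
                        "refresh_token": "REFRESH_TOKEN",
                        "client_id": "CLIENT_ID",
                        "client_secret": "CLIENT_SECRET",
                        "project_id": "PROJECT_ID",
                    },
                },
            },
        ],
    }


# Query parameters for the bulk import call; 200 is the batch size
# mentioned in the question above.
IMPORT_PARAMS = {"action": "upsert", "remote_embedding_batch_size": 200}

if __name__ == "__main__":
    print(json.dumps(build_schema(["title", "body"]), indent=2))
```

Nothing here talks to a server; the dicts would be passed to whichever Typesense client is in use when creating the collection and importing documents.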