# community-help
Hi, I want to use Google's `text-embedding-005` model, but the docs aren't super clear on how Typesense handles embedding documents (or I missed it):
• I want to create 256-dimensional embeddings... is setting `num_dim` enough?
• Google has this concept of Task Types. Is that used? Using `RETRIEVAL_DOCUMENT` and `RETRIEVAL_QUERY` might be optimal... not 100% sure.
• I write to this collection daily, but the fields I want to embed don't change that often. Does Typesense only update the embedding if an `embed.from` field changes? Or does a write event trigger a re-computation regardless?
• I have a collection with ~800k documents... is it worth trying a batch size above 200? Not sure if I'll hit rate limits or anything.
• One of the fields I want to embed can be very long. Do you truncate it to a certain max length?
• How do you preprocess and format/order the embedding input if there are multiple `embed.from` fields (and arrays etc.)?