# community-help
d
Hi team! I’m currently testing the built-in embedding feature from the alpha doc. Do you support indexing with embeddings for update imports, for instance with
emplace
?
k
Yes, updates should be handled.
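For reference, an emplace import goes through the regular bulk import endpoint with `action=emplace`, which updates existing documents (matched by `id`) and creates new ones in the same call. A minimal sketch of how the request could be assembled (the collection name and documents here are made up for illustration):

```python
import json

def build_emplace_import(docs, collection="products"):
    """Build the URL path and JSONL body for a Typesense bulk import.

    With action=emplace, documents whose id already exists are updated,
    and the rest are created, in a single import call.
    """
    path = f"/collections/{collection}/documents/import?action=emplace"
    # The import body is newline-delimited JSON, one document per line.
    body = "\n".join(json.dumps(d) for d in docs)
    return path, body

path, body = build_emplace_import([
    {"id": "1", "title": "first"},
    {"id": "2", "title": "second"},
])
```

Sending `body` as the POST payload to `path` on the Typesense server would then perform the emplace import.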
d
Great, it’s working now; I did something wrong on the first attempt 🙃 But I ran into another problem: I ran re-indexing and found that while built-in embedding works during indexing, it affects search performance. That’s totally fine and expected, so I increased the available CPU on the node 2 -> 4 -> 8 -> 16, but it’s still not enough. Can I decrease the concurrency of embedding / indexing somehow? For now I’m sending 500 rows in one batch; maybe I should decrease that amount?
j
Yeah, reducing the number of documents you send in an import API call and pacing them out would be the way to reduce indexing concurrency.
On a side note, if you have over, say, 100K documents, I would recommend running the embedding generation on a GPU, which speeds it up significantly for local models.
GPU support is already in the latest build, but we still need to put together docs on how to set it up, since it unfortunately involves installing external dependencies like CUDA (too large to bundle within Typesense).
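The batching advice above can be sketched on the client side: split the documents into smaller import batches and pause between calls, so the built-in embedding model isn’t saturated during re-indexing. The batch size and pause duration below are illustrative, not recommended values:

```python
import time

def paced_batches(docs, batch_size=100, pause_s=1.0):
    """Yield smaller import batches, sleeping between them to throttle
    how much embedding/indexing work the server does at once."""
    for i in range(0, len(docs), batch_size):
        yield docs[i:i + batch_size]
        # Pause between batches, but not after the final one.
        if i + batch_size < len(docs):
            time.sleep(pause_s)

# Example: 250 docs split into batches of 100, 100, and 50.
batches = list(paced_batches(list(range(250)), batch_size=100, pause_s=0))
```

Each yielded batch would then be sent as one import API call, instead of a single 500-row request.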
d
Got it, thank you!
And one last question — are the built-in embeddings saved into the snapshot? Or does all content have to be re-indexed and embedded after every restart?
j
Once the embeddings are generated, they are indeed saved into the snapshot during the next hourly snapshot in the most recent RC builds (earlier builds had a bug that caused embeddings to be regenerated on each restart).
d
Ah, maybe that’s the reason. Do you have a tar.gz with the latest RC at hand, or is it better to just wait for a release?
d
Extracted binary, thank you
👍 1