# community-help
d
Hi team! I’m currently testing the built-in embedding feature from the alpha doc. Do you support indexing with embeddings for update imports, for instance with
emplace
?
k
Yes, updates should be handled.
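For reference, an emplace import goes through the regular bulk import endpoint with `action=emplace`, which updates existing documents (matched by `id`) and creates new ones in the same call. A minimal sketch of how the request could be assembled (the collection name and documents here are made up for illustration):

```python
import json

def build_emplace_import(docs, collection="products"):
    """Build the URL path and JSONL body for a Typesense bulk import.

    With action=emplace, documents whose id already exists are updated,
    and the rest are created, in a single import call.
    """
    path = f"/collections/{collection}/documents/import?action=emplace"
    # The import body is newline-delimited JSON, one document per line.
    body = "\n".join(json.dumps(d) for d in docs)
    return path, body

path, body = build_emplace_import([
    {"id": "1", "title": "first"},
    {"id": "2", "title": "second"},
])
```

Sending `body` as the POST payload to `path` on the Typesense server would then perform the emplace import.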
d
Great, it’s working now; I did something wrong on the first attempt 🙃 But I ran into another problem: I ran re-indexing and found that while built-in embedding works during indexing, it affects search performance. That’s totally fine and expected, so I increased the available CPU on the node 2 -> 4 -> 8 -> 16, but it’s still not enough. Can I decrease the concurrency of embedding / indexing somehow? For now I’m sending 500 rows in one batch; maybe I should decrease that amount?
j
Yeah, reducing the number of documents you send in an import API call and pacing them out would be the way to reduce indexing concurrency.
On a side note, if you have over, say, 100K documents, I would recommend running the embedding generation on a GPU, which speeds it up significantly for local models.
GPU support is already in the latest build, but we still need to put together docs on how to set it up, since it unfortunately involves installing external dependencies like CUDA (too large to bundle within Typesense).
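The batching advice above can be sketched on the client side: split the documents into smaller import batches and pause between calls, so the built-in embedding model isn’t saturated during re-indexing. The batch size and pause duration below are illustrative, not recommended values:

```python
import time

def paced_batches(docs, batch_size=100, pause_s=1.0):
    """Yield smaller import batches, sleeping between them to throttle
    how much embedding/indexing work the server does at once."""
    for i in range(0, len(docs), batch_size):
        yield docs[i:i + batch_size]
        # Pause between batches, but not after the final one.
        if i + batch_size < len(docs):
            time.sleep(pause_s)

# Example: 250 docs split into batches of 100, 100, and 50.
batches = list(paced_batches(list(range(250)), batch_size=100, pause_s=0))
```

Each yielded batch would then be sent as one import API call, instead of a single 500-row request.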
d
Got it, thank you!
And one last question — are the built-in embeddings saved into the snapshot? Or does all content have to be re-indexed and embedded after every restart?
j
Once the embeddings are generated, they are indeed saved into the snapshot during the next hourly snapshot in the most recent RC builds (earlier builds had a bug that caused embeddings to be regenerated on each restart).
d
Ah, maybe that’s the reason. Do you have a tar.gz with the latest RC at hand, or is it better to just wait for a release?
d
Extracted binary, thank you
👍 1