I'm seeing high latency per request when using OpenAI embeddings. Is this because each search query is forwarded to OpenAI to generate a vector, which is then used to perform the search? Or is it because OpenAI's embeddings are much larger?
When using `ts/all-MiniLM-L12-v2`, processingTimeMS is ~20 ms; when using `openai/text-embedding-3-small`, it's ~450-500 ms.
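To help isolate where the time goes, here's a minimal sketch (assuming the official Python `openai` client and an `OPENAI_API_KEY` in the environment) that times the embedding call by itself, outside the search engine. If this alone takes ~400+ ms, the extra latency is likely the network round-trip to OpenAI for each query vector rather than the search over the larger vectors:

```python
import time

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Time a single embedding request in isolation to see how much of the
# per-request latency is just the OpenAI API round-trip.
start = time.perf_counter()
resp = client.embeddings.create(
    model="text-embedding-3-small",
    input="example search query",  # hypothetical query text
)
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"embedding round-trip: {elapsed_ms:.0f} ms")
print(f"vector dimensions: {len(resp.data[0].embedding)}")
```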
I'm also open to suggestions for a good local model to use!