# community-help
I am facing high per-request latency when using OpenAI embeddings. Is this because the search query is being forwarded to OpenAI to create a vector, which is then used to perform the search? Or is it because OpenAI's embeddings are much larger? When using `ts/all-MiniLM-L12-v2`, `processingTimeMS` is ~20 ms, and when using `openai/text-embedding-3-small`, it is ~450-500 ms. Also open to suggestions for a good local model to use!
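To sanity-check whether vector size alone could explain the gap, here is a rough brute-force timing sketch one could run (synthetic data, numpy assumed; MiniLM-L12 produces 384-dim vectors, text-embedding-3-small produces 1536-dim). If the dimension difference only accounts for a few milliseconds, the remaining latency likely comes from the remote API round trip:

```python
import time
import numpy as np

def brute_force_search_ms(dim, n_docs=100_000, runs=5):
    """Average time (ms) for one brute-force nearest-neighbor search
    over n_docs random unit vectors of the given dimension."""
    rng = np.random.default_rng(0)
    docs = rng.standard_normal((n_docs, dim)).astype(np.float32)
    docs /= np.linalg.norm(docs, axis=1, keepdims=True)
    query = rng.standard_normal(dim).astype(np.float32)
    query /= np.linalg.norm(query)

    start = time.perf_counter()
    for _ in range(runs):
        # cosine similarity reduces to a dot product on normalized vectors
        _ = (docs @ query).argmax()
    return (time.perf_counter() - start) / runs * 1000.0

t_small = brute_force_search_ms(384)    # MiniLM-L12 dimension
t_large = brute_force_search_ms(1536)   # text-embedding-3-small dimension
print(f"384-dim: {t_small:.1f} ms, 1536-dim: {t_large:.1f} ms")
```

The corpus size (100k) and run count are arbitrary; the point is only to compare the two dimensions on the same hardware, not to reproduce the exact `processingTimeMS` numbers.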