# community-help
m
Hi there, I'm using CLIP for an image search engine and I want to tune a proper distance_threshold, but the vector distances are all very similar, in the 0.7 to 0.8 range. Do you have any advice on how to find the best distance threshold?
k
This is very difficult to call because it's domain-specific. Generally, when a lot of distances cluster in the 0.7 to 0.8 range, it means many of the results are actually not that relevant, so the model isn't able to clearly differentiate them and rank them more sharply. One practical approach is to hand-label a sample of query results as relevant or irrelevant, then look at where the two groups' distances separate.
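For example, a minimal sketch of that idea (the distance values below are placeholders; you'd collect them from real queries you've labeled yourself):
```python
import numpy as np

def cosine_distance(a, b):
    # Cosine distance = 1 - cosine similarity; L2-normalize first
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - float(a @ b)

# Placeholder distances from real queries, hand-labeled as
# relevant or irrelevant matches
relevant = np.array([0.70, 0.71, 0.72, 0.73])
irrelevant = np.array([0.76, 0.77, 0.78, 0.80])

# Simple starting point: split the two groups down the middle,
# then verify against held-out queries
threshold = (relevant.max() + irrelevant.min()) / 2
print(f"candidate distance_threshold: {threshold:.3f}")
```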
m
Thank you! While searching for a solution, I found that image preprocessing (resizing, normalizing, ...) and normalizing the embeddings may have positive effects. Is there any way to do image preprocessing and embedding normalization when using Typesense's built-in CLIP model?
k
You have to do that processing outside of Typesense, i.e. generate the embeddings yourself and index the resulting vectors.
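For reference, a minimal sketch of doing that with the open_clip library (an assumption on my part; any CLIP implementation works, and the model name and file path are placeholders). The preprocess transform handles the resizing/normalization, and the final step L2-normalizes the embedding before you store it as a vector field:
```python
import open_clip
import torch
from PIL import Image

# Load a CLIP model plus its matching image preprocessing transform
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="openai"
)
model.eval()

# preprocess resizes, crops, and normalizes the image for the model
image = preprocess(Image.open("photo.jpg")).unsqueeze(0)
with torch.no_grad():
    emb = model.encode_image(image)

# L2-normalize so cosine similarity equals the dot product
emb = emb / emb.norm(dim=-1, keepdim=True)
vector = emb.squeeze(0).tolist()  # ready to index as a float[] field
```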