Vector Search Filter and Cosine Similarity in Typesense
TLDR LT asked about vector search result filtering and cosine similarity. Kishore Nallan explained how cosine distance is related to similarity and shared plans to add a threshold restriction option.
Mar 25, 2023 (6 months ago)
LT
10:09 PM You answered then "Nope that's not possible but these distances don't carry any semantic absolute meaning. They are only useful as relative values."
But just for my understanding: the vector distance is the cosine similarity between the requested embedding and the one in Typesense, right? Because if I want to store face embeddings and the cosine similarity indicates how similar they are, the distance is very meaningful.
Are there plans to add filter_by functionality for vector distance or is the "go-to-way" for the next years to postprocess the results?
Mar 26, 2023 (6 months ago)
Kishore Nallan
01:49 PM cosine_distance = 1 - cosine_similarity
When 2 vectors are exactly the same, the cosine similarity will be 1, so the cosine distance will be 0.
Likewise, when 2 vectors are as different as possible (pointing in opposite directions), the cosine similarity will be -1, so the cosine distance will be 2.
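A minimal sketch of the relationship above, assuming vectors are plain Python lists of floats. The function names are illustrative only and are not part of any Typesense API:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a, b):
    """cosine_distance = 1 - cosine_similarity, so it ranges from 0 to 2."""
    return 1.0 - cosine_similarity(a, b)

same = cosine_distance([1.0, 0.0], [1.0, 0.0])        # identical direction
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])  # unrelated direction
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])   # opposite direction
```

Here `same` is 0, `orthogonal` is 1, and `opposite` is 2, matching the two endpoints described above.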
We plan to add a way to restrict results based on a threshold. When I said "don't carry any semantic absolute meaning", I meant generically across datasets. It's still useful to have a cutoff threshold for some datasets, so we will be adding an option for that.
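Until such an option exists, the post-processing approach mentioned in the question could look roughly like the sketch below. It assumes each search hit is a dict carrying a numeric `vector_distance` field. The sample hits, the `filter_by_distance` helper, and the 0.3 cutoff are all made up for illustration, and a useful cutoff would have to be tuned per dataset:

```python
def filter_by_distance(hits, max_distance):
    """Keep only hits whose cosine distance is at or below the cutoff."""
    return [hit for hit in hits if hit["vector_distance"] <= max_distance]

# Hypothetical hits, shaped like a parsed search response (assumed shape).
sample_hits = [
    {"document": {"id": "face-1"}, "vector_distance": 0.08},
    {"document": {"id": "face-2"}, "vector_distance": 0.27},
    {"document": {"id": "face-3"}, "vector_distance": 0.91},
]

# 0.3 is an arbitrary example threshold, not a recommended value.
close_matches = filter_by_distance(sample_hits, max_distance=0.3)
close_ids = [hit["document"]["id"] for hit in close_matches]
```

With the sample data, `close_ids` contains `face-1` and `face-2`; `face-3` is dropped because its distance exceeds the cutoff.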