Understanding Score Calculation in Hybrid Search
TLDR Narayan questioned how scores in hybrid search are calculated. Kishore Nallan explained that the number of vectors fetched equals `min(k, limit)`.
Oct 04, 2023 (2 months ago)
Narayan
12:26 AM
`limit` for the query and `k` for the vector search: does `limit` influence `k` in any way, for example by setting it to `min(k, limit)` as an optimization? Or is `limit` used for getting keyword search results and `k` for vector search results, which are then combined before finally taking the top `limit` results? I would assume that if I increase `k`, say from 5 to 10, the results in the second option with `k=10` should contain all the results in the first option with `k=5`. But I see otherwise. This happens only in some cases in my very large dataset, not always, and I have failed to create a minimal reproducible example.
Kishore Nallan
11:59 AM
`min(k, limit)` will be the actual number of vectors fetched.
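To make the reply concrete, here is a minimal sketch of the `min(k, limit)` rule. It only restates the arithmetic from the answer above. The function name `vectors_fetched` and the sample values are hypothetical, and this is not Typesense's actual implementation.

```python
def vectors_fetched(k: int, limit: int) -> int:
    """Number of nearest neighbours actually retrieved, per the reply above.

    Hypothetical helper for illustration only; not part of any Typesense API.
    """
    return min(k, limit)


# Sample values mirroring the question: k raised from 5 to 10.
for limit in (5, 10, 20):
    at_k5 = vectors_fetched(5, limit)
    at_k10 = vectors_fetched(10, limit)
    print(f"limit={limit}: k=5 fetches {at_k5}, k=10 fetches {at_k10}")
```

Under this rule, raising `k` from 5 to 10 changes the number of vectors fetched only when `limit` is greater than 5. With `limit` at 5 or below, both settings fetch the same number of vectors.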
Similar Threads
Discussion on Calculating Vector Distance in Hybrid Search Results
Narayan raised concerns on the `vector_distance` results of keyword hits in hybrid search, suggesting it could be manually calculated. Jason explained the algorithm's design, where `vector_distance` does not apply to keyword searches. They agreed on future considerations for scenarios relying on vector distance.
Hybrid Search Distance Threshold Issue
Anish has an issue with search results not respecting the vector distance threshold when using hybrid search. Jason explains additional fields cause `vector_distance` to only apply to vector search results and suggests opening a feature request on GitHub.
Integrating Semantic Search with Typesense
Krish wants to integrate semantic search functionality with Typesense but struggles with its limitations. Kishore Nallan provides resources, clarifications, and workarounds for the issues raised.