Understanding Score Calculation in Hybrid Search
TLDR Narayan asked how scores in hybrid search are calculated. Kishore Nallan explained that the number of vectors fetched equals `min(k, limit)`.
Oct 04, 2023
Given `limit` for the query and `k` for the vector search - is `k` in any way set to `min(k, limit)` as an optimization? Or is `limit` used for getting keyword search results and `k` used for vector search results, followed by combining them and then finally taking the top `limit` results? I would assume that if I increase `k`, say from 5 to 10, the results in the second option with `k=10` should contain all the results in the first option with `k=5`. But I see otherwise. This happens only in some cases in my very large dataset, not always, and I have failed to create a minimal reproducible example.
Kishore Nallan 11:59 AM
`min(k, limit)` will be the actual number of vectors fetched.
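To make the reply concrete, here is a minimal Python sketch. Only the `min(k, limit)` rule comes from the thread. Everything else is an assumption for illustration: the helper names `vectors_fetched` and `fuse`, the 0.7/0.3 weights, the toy document IDs, and the reciprocal-rank fusion itself, which is a stand-in and not Typesense's actual scoring code. Under those assumptions it shows one way the results for a smaller `k` might not be a subset of the results for a larger `k`: a newly fetched vector hit can boost a lower-ranked keyword hit past another document.

```python
def vectors_fetched(k: int, limit: int) -> int:
    """Number of vector candidates fetched, per the reply: min(k, limit)."""
    return min(k, limit)


def fuse(keyword_ids, vector_ids, limit, vector_weight=0.3):
    """Toy reciprocal-rank fusion (an assumption, not Typesense's code).

    Each document scores weight / rank in every list it appears in;
    the top `limit` documents by combined score are returned.
    """
    scores = {}
    for rank, doc in enumerate(keyword_ids, start=1):
        scores[doc] = scores.get(doc, 0.0) + (1 - vector_weight) / rank
    for rank, doc in enumerate(vector_ids, start=1):
        scores[doc] = scores.get(doc, 0.0) + vector_weight / rank
    # Sort by descending score, breaking ties by document ID.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:limit]


# Hypothetical data: keyword hits and nearest neighbours, best first.
keyword_hits = ["a", "b", "c"]
vector_ranking = ["a", "c"]
limit = 2

# k=1: only "a" is fetched from the vector index.
top_small = fuse(keyword_hits, vector_ranking[:vectors_fetched(1, limit)], limit)
# k=2: "c" is fetched too, and its vector score lifts it above "b".
top_large = fuse(keyword_hits, vector_ranking[:vectors_fetched(2, limit)], limit)

print(top_small)  # ['a', 'b']
print(top_large)  # ['a', 'c']
```

In this toy setup, raising `k` drops "b" from the top results even though nothing about "b" changed. Whether this matches what happens on a real dataset depends on how the engine actually combines the two rankings.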
Discussion on Calculating Vector Distance in Hybrid Search Results
Narayan raised concerns on the `vector_distance` results of keyword hits in hybrid search, suggesting it could be manually calculated. Jason explained the algorithm's design, where `vector_distance` does not apply to keyword searches. They agreed on future considerations for scenarios relying on vector distance.
Hybrid Search Distance Threshold Issue
Anish has an issue with search results not respecting the vector distance threshold when using hybrid search. Jason explains additional fields cause `vector_distance` to only apply to vector search results and suggests opening a feature request on GitHub.
Integrating Semantic Search with Typesense
Krish wants to integrate semantic search functionality with Typesense but struggles with its limitations. Kishore Nallan provides resources, clarifications, and workarounds for the issues raised.