# community-help
s
Btw, since Typesense allows hybrid search and also lets you use custom embedding models, is it possible to use the hybrid search functionality but with 2 embedding fields? One for dense embeddings (e.g. Azure OpenAI's text-embedding-3-large) and one for sparse embeddings?
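Roughly what I have in mind, as a sketch (field names, the model config and the connection details are placeholders, and as far as I can tell there's no dedicated sparse-vector type, so the "sparse" field would just be a plain float[] I'd populate myself):

```python
import typesense

# Placeholder connection details.
client = typesense.Client({
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
    'connection_timeout_seconds': 5,
})

schema = {
    'name': 'docs',
    'fields': [
        {'name': 'doc_title', 'type': 'string'},
        {'name': 'chunk', 'type': 'string'},
        # Dense embeddings auto-generated by Typesense via a remote model.
        # Model name / api_key are placeholders; the exact Azure OpenAI
        # model_config options would need to be checked against the docs.
        {'name': 'chunk_dense', 'type': 'float[]',
         'embed': {'from': ['chunk'],
                   'model_config': {'model_name': 'openai/text-embedding-3-large',
                                    'api_key': 'OPENAI_API_KEY'}}},
        # "Sparse" embeddings precomputed outside Typesense and indexed as an
        # ordinary fixed-dimension float[] field.
        {'name': 'chunk_sparse', 'type': 'float[]', 'num_dim': 768},
    ],
}

client.collections.create(schema)
```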
f
You'd add the additional embedding field to `query_by` (and exclude it from the resulting values afterwards)
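So roughly something like this (a sketch reusing the example schema above; mixing text fields with an embedding field in `query_by` is what triggers hybrid search):

```python
results = client.collections['docs'].documents.search({
    'q': 'how do I rotate api keys?',
    # Text fields + one auto-embedding field -> hybrid (keyword + vector) search.
    'query_by': 'doc_title,chunk,chunk_dense',
    # Keep the raw vectors out of the response payload.
    'exclude_fields': 'chunk_dense',
})
```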
s
Not sure how that would work out, as the query would need to be embedded once as a dense vector and once as a sparse vector. Would I then simply give it two queries? How will the system then know which query to use for which field? Also, how will the results of both vector searches be combined with the text search?
f
Don't you have two separate embedding fields, one for the sparse and one for the dense vector? What does your schema look like?
j
While you can have many embedding fields in your documents, you can only use one of them in `query_by` at a time for a given search query.
s
Nope, just an idea based on my experiences with different RAG approaches. But I wanted to search over the fields Doc_Titel, Chunk, Chunk (dense embedding), Chunk (sparse embedding), doc_summary and so on (plus some filters before the search based on date, ...). Since I wanted two different embeddings, the results of the two embedding searches would have been combined using some scoring function and custom weighting. Afterwards those results would be combined with the text search and reranked using a reranker. Still, thanks for your responses, but do you maybe have some workaround for that use case?
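What I'm imagining on the client side is roughly this (just a sketch; field names, weights, the merge formula and the placeholder query vector are all made up, and the reranking step is left out):

```python
def weighted_merge(dense_hits, sparse_hits, w_dense=0.7, w_sparse=0.3):
    """Naive client-side fusion: smaller vector_distance == closer match."""
    scores = {}
    for weight, hits in ((w_dense, dense_hits), (w_sparse, sparse_hits)):
        for hit in hits:
            doc_id = hit['document']['id']
            score = weight * (1.0 - hit.get('vector_distance', 1.0))
            scores[doc_id] = scores.get(doc_id, 0.0) + score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# One search per embedding field, sent in a single multi_search round trip.
searches = {'searches': [
    # Hybrid search over the text fields plus the dense embedding field.
    {'collection': 'docs', 'q': 'rotating api keys',
     'query_by': 'doc_title,chunk,chunk_dense', 'exclude_fields': 'chunk_dense'},
    # Pure vector search against the "sparse" field with a precomputed query
    # vector (truncated placeholder; it would need num_dim values).
    {'collection': 'docs', 'q': '*',
     'vector_query': 'chunk_sparse:([0.12, 0.0, 0.87], k: 50)'},
]}
responses = client.multi_search.perform(searches, {})
ranked = weighted_merge(responses['results'][0]['hits'],
                        responses['results'][1]['hits'])
# `ranked` could then be handed to an external reranker.
```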
k
We don't have a way to query multiple embeddings and merge the results the way you've described. Given the memory and latency involved in vector search, this approach is generally not very scalable. I think a simpler idea would be to have a single text representation of your various fields and fine-tune the embedding model that you use on the expected ranking.
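For the single-text-representation idea, one option is to point a single embedding field at all the text fields together, for example (a sketch; field and model names are placeholders, and the fine-tuned model would be plugged in via the model config):

```python
schema = {
    'name': 'docs_single_embedding',
    'fields': [
        {'name': 'doc_title', 'type': 'string'},
        {'name': 'chunk', 'type': 'string'},
        {'name': 'doc_summary', 'type': 'string'},
        # One embedding built from the concatenation of all text fields;
        # the model name is a placeholder for whichever (possibly fine-tuned)
        # model you end up using.
        {'name': 'embedding', 'type': 'float[]',
         'embed': {'from': ['doc_title', 'chunk', 'doc_summary'],
                   'model_config': {'model_name': 'openai/text-embedding-3-large',
                                    'api_key': 'OPENAI_API_KEY'}}},
    ],
}

client.collections.create(schema)
```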