Hi there I have a document like this title office bag img li typesense #community-help


Mohammad Javad Alizadeh

01/29/2025, 12:50 PM

Hi there! I have a document like this: { "title":"office bag", "img_link":"some dummy url", "img_links":[ "list of dummy url" ] }. A document may have one or more images, and I want to do a hybrid search on the product title and the image embeddings. Can I have a list of embeddings for a document? Or should I create a separate collection for the image embeddings and do a join for hybrid search?

Jason Bosco

01/29/2025, 6:25 PM

You can use the CLIP model to generate embeddings for both your image and text together in a single embedding field

Mohammad Javad Alizadeh

01/29/2025, 6:27 PM

Thank you so much, but my real problem is how to handle embeddings for multiple images.

Jason Bosco

01/29/2025, 6:29 PM

I would recommend creating one document per image in Typesense, so you can create one embedding per image per document

Jason Bosco

01/29/2025, 6:29 PM

And then you could potentially use group_by, if the products have product IDs, for example.
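As a rough illustration of the "one document per image" idea, the sketch below splits a product into per-image documents that share the product fields. The `product_id` field and the `<product_id>-<n>` document ID scheme are assumptions made for this example, not something stated in the thread; the image embedding would be computed separately and added to each document before indexing.

```python
from typing import Any


def explode_product(product: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one document per image, each carrying the shared product fields."""
    # Main image first, then the extra images; duplicates dropped, order kept.
    urls = list(dict.fromkeys([product["img_link"], *product.get("img_links", [])]))
    # Everything except the image fields is copied onto every per-image document.
    shared = {k: v for k, v in product.items() if k not in ("img_link", "img_links")}
    return [
        # "product_id" is a hypothetical field used later as the group_by key.
        {**shared, "id": f'{product["product_id"]}-{i}', "img_link": url}
        for i, url in enumerate(urls)
    ]


docs = explode_product(
    {
        "product_id": "p1",
        "title": "office bag",
        "img_link": "a",
        "img_links": ["b", "a", "c"],
    }
)
```

Grouping on the shared `product_id` at search time would then collapse these per-image documents back into one result per product.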

Mohammad Javad Alizadeh

01/29/2025, 6:39 PM

So you are saying it's better to create a separate collection and insert one embedding per image per document, and do a hybrid search on image embedding and title, and finally group_by title or IDs, am I right?

Jason Bosco

01/29/2025, 6:42 PM

That's correct... Although you won't be able to use Typesense's built-in models for hybrid search, given that these are image embeddings. You would have to generate these embeddings with the CLIP model outside of Typesense, and then combine keyword search with the vector_query parameter
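A minimal sketch of what such a combined request could look like, assuming the externally generated vectors are stored in a float array field named `embedding` and the keyword part runs against `title` (both names are assumptions for this example). The function only builds the search parameters; sending them is left to whichever client is in use, and for long vectors the multi_search endpoint is probably the safer route, since the parameters then travel in the request body.

```python
def hybrid_search_params(text: str, query_vector: list[float], k: int = 100) -> dict[str, str]:
    """Build Typesense search parameters that combine keyword and vector search."""
    vector = ", ".join(f"{x:.6f}" for x in query_vector)
    return {
        "q": text,                 # keyword part of the hybrid query
        "query_by": "title",       # assumed text field
        # "embedding" is an assumed field holding the externally computed vectors.
        "vector_query": f"embedding:([{vector}], k:{k})",
        "exclude_fields": "embedding",  # keep the large vectors out of the response
    }


params = hybrid_search_params("office bag", [0.5, 0.25])
```

Here `query_vector` stands for an embedding produced outside Typesense with the same model that embedded the images.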

Mohammad Javad Alizadeh

01/29/2025, 6:44 PM

Thank you so much. Actually, I'm doing that already, because I'm using Google's SigLIP model for its better accuracy.

👍 1

Mohammad Javad Alizadeh

01/29/2025, 7:33 PM

Hello again Jason, I thought about your recommendation and realized that it will not solve my problem, so let me ask it another way. Imagine I have a product like this:

{
    "title": "فر کننده مو بابلس مک استایلر  MAC STYLER titanium curling iron keratin ",
    "subtitle": "",
    "page_unique": "1885",
    "current_price": "2496000",
    "old_price": "3337000",
    "availability": "instock",
    "category_name": "لوازم برقی",
    "image_link": "https://oss.sazito.com/apiuploads/offerie/uploads/image/rootimage/5195/8ea5bfd3b9d3f895935993a14f23ddb7.png",
    "image_links": [
        "https://oss.sazito.com/apiuploads/offerie/uploads/image/rootimage/5196/16fd3eb11980f587383cde28392beb5b.png",
        "https://oss.sazito.com/apiuploads/offerie/uploads/image/rootimage/5203/77894fd9802618351e9cc67fdeb72122.webp",
        "https://oss.sazito.com/apiuploads/offerie/uploads/image/rootimage/5197/caa48b74de6db19bf12bf6ccc497f735.png"
    ],
    "page_url": "https://offerie.ir/product/فر-کننده-مو-بابلس-مک-استایلر-MAC-STYLER-titanium-curling-iron-keratin",
    "short_desc": "",
    "guarantee": "",
    "registry": "",
    "spec": {
        "سایز": "۱۹ میل طلایی"
    }
},

and I want to have all the search features, like filtering, sorting, ...

Mohammad Javad Alizadeh

01/29/2025, 7:35 PM

But I also want to do hybrid search on the image embeddings and the title. The problem is that a product may have multiple images, and creating multiple documents for this product would duplicate all the product data.

Jason Bosco

01/29/2025, 11:29 PM

If you group by, say, page_url (or preferably some SKU or product ID field), then facet counts, filtering, etc. will produce de-duplicated data
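To make the grouping suggestion concrete, here is a small sketch of search parameters that group per-image documents by a product identifier while still filtering and faceting. It assumes `page_unique` serves as the product ID and is declared as a facet field in the schema, and it takes `availability` and `category_name` from the sample product; treat the exact field choices as illustrative.

```python
def grouped_search_params(text: str, group_field: str = "page_unique") -> dict[str, object]:
    """Build search parameters that return one hit per product."""
    return {
        "q": text,
        "query_by": "title",
        "group_by": group_field,   # assumed to be a faceted product ID field
        "group_limit": 1,          # keep only the best-matching image per product
        "filter_by": "availability:=instock",
        "facet_by": "category_name",
    }


grouped = grouped_search_params("curling iron")
```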

Mohammad Javad Alizadeh

01/30/2025, 10:34 AM

Thank you, but I don't think I explained my question very well. I'm looking for something like a subquery in SQL: I want to do my hybrid search in a separate collection, and then do the other things (filtering, sorting, ...) in the main collection.
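One way to approximate that subquery pattern on the client side is to run two requests: first search the image collection, then feed the matching product IDs into a filter on the main collection. This is only a sketch of that idea and not something confirmed in the thread; it assumes both collections share a `page_unique` field with simple values that need no escaping, and that the first response has the usual `hits` list with a `document` in each hit.

```python
def product_filter_from_hits(response: dict, id_field: str = "page_unique") -> str:
    """Turn the hits of a first search into a filter_by clause for a second one."""
    ids: list[str] = []
    for hit in response.get("hits", []):
        value = str(hit["document"][id_field])
        if value not in ids:  # several images can point at the same product
            ids.append(value)
    if not ids:
        return ""  # nothing matched, so the caller can skip the second query
    return f"{id_field}:[{','.join(ids)}]"


image_search_response = {
    "hits": [
        {"document": {"page_unique": "1885"}},
        {"document": {"page_unique": "1902"}},
        {"document": {"page_unique": "1885"}},
    ]
}
filter_by = product_filter_from_hits(image_search_response)
```

The resulting string can be combined with other filters in the second query using `&&`. The relevance order from the first search is not carried over by the filter, so it would have to be reapplied client-side if it matters.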