Hi folks, I want to explore a feasibility here wit...
# community-help
l
Hi folks, I want to explore whether something is feasible with current Typesense features. I have a hypothetical scenario (see attached image), and these are the documents matching this example:
[
  {
    screenId: "screen_a",
    ui_elements: [
      {
        groupId: "group:button:1",
        class: "button",
      },
      {
        groupId: "group:button:1",
        class: "button",
      },
      {
        groupId: "group:icon:1",
        class: "icon",
      },
    ],
    embedding: [...]
  },
  {
    screenId: "screen_b",
    ui_elements: [
      {
        groupId: "group:button:1",
        class: "button",
      },
      {
        groupId: "group:icon:1",
        class: "icon",
      },
    ],
    embedding: [...]
  },
  {
    screenId: "screen_c",
    ui_elements: [
      {
        groupId: "group:button:1",
        class: "button",
      },
      {
        groupId: "group:button:2",
        class: "button",
      },
      {
        groupId: "group:icon:1",
        class: "icon",
      },
    ],
    embedding: [...]
  },
  {
    screenId: "screen_d",
    ui_elements: [
      {
        groupId: "group:icon:1",
        class: "icon",
      },
    ],
    embedding: [...]
  },
];
Can I search for documents that:
• contain both button & icon
• and, in my search results, have screen_a and screen_b de-duplicated so only one of them is shown (since they both contain the same types of elements)?
I'm thinking of using the embedding field shown above to de-duplicate similar screens, instead of relying on the element groupings. Is that possible with the Typesense APIs? 🙏
f
You can group by faceted fields: https://typesense.org/docs/29.0/api/search.html#grouping-parameters How would the embedding work? Is it a set of float arrays, like the Vector Search guide mentions?
l
Oh, I tried grouping by faceted fields, but the group key turned out not to be what I wanted. For example, if I do group_by=ui_elements.groupId, the grouping key is the whole array, like ["group:button:1","group:button:2","group:icon:1"].
The embedding could be a 1024-dimensional float array representing that screen image.
I was thinking the embedding could be used to measure visual similarity between screens for de-duplication, but I'm not sure how that would work.
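For what it's worth, one way the embedding idea could work outside of Typesense is a greedy pass over the ranked hits: keep a hit only if its cosine similarity to every already-kept hit is below some threshold. This is only a sketch under assumptions. The function names, the 0.95 threshold, and the toy 3-dimensional vectors (standing in for the 1024-dimensional ones) are all hypothetical, and it runs in the app after the search, not inside Typesense.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy de-dup: walk the hits in ranked order and keep a hit only if it is
// not too similar to any hit that was already kept.
// The default threshold is an assumption and would need tuning on real data.
function dedupeByEmbedding(hits, threshold = 0.95) {
  const kept = [];
  for (const hit of hits) {
    const isDuplicate = kept.some(
      (k) => cosine(k.embedding, hit.embedding) >= threshold
    );
    if (!isDuplicate) kept.push(hit);
  }
  return kept;
}

// Toy vectors: screen_a and screen_b point in almost the same direction.
const sampleHits = [
  { screenId: "screen_a", embedding: [1, 0, 0] },
  { screenId: "screen_b", embedding: [0.99, 0.05, 0] },
  { screenId: "screen_c", embedding: [0, 1, 0] },
];

const uniqueScreens = dedupeByEmbedding(sampleHits).map((h) => h.screenId);
console.log(uniqueScreens); // ["screen_a", "screen_c"]
```

Because the pass is greedy, the first (highest-ranked) screen of each near-duplicate cluster wins, so the result depends on the order of the hits.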
f
Grouping by embeddings won't work. One approach that could work well is handling the deduplication on the client side, or in your backend, after the initial search. You'd run the search in Typesense with the filters for buttons and icons, then in your app group the results by groupId (or however you want to define similarity) and select the best candidate from each group, for example based on which screen has more unique UI element types, or by using the embedding field for semantic similarity.
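A minimal sketch of the client-side grouping described above, assuming the search has already returned the screens that contain both a button and an icon (screen_a, screen_b and screen_c from the example). The helper names are hypothetical, and treating "the set of distinct groupId values on a screen" as the similarity key is just one possible definition.

```javascript
// Hits as they might come back from the search, in ranked order.
// screen_d is absent because it has no button and would be filtered out.
const results = [
  {
    screenId: "screen_a",
    ui_elements: [
      { groupId: "group:button:1", class: "button" },
      { groupId: "group:button:1", class: "button" },
      { groupId: "group:icon:1", class: "icon" },
    ],
  },
  {
    screenId: "screen_b",
    ui_elements: [
      { groupId: "group:button:1", class: "button" },
      { groupId: "group:icon:1", class: "icon" },
    ],
  },
  {
    screenId: "screen_c",
    ui_elements: [
      { groupId: "group:button:1", class: "button" },
      { groupId: "group:button:2", class: "button" },
      { groupId: "group:icon:1", class: "icon" },
    ],
  },
];

// Build a key from the distinct groupIds on a screen, ignoring repeats
// and order, so screen_a and screen_b end up with the same key.
function groupSignature(doc) {
  const ids = new Set(doc.ui_elements.map((el) => el.groupId));
  return [...ids].sort().join("|");
}

// Keep the first (highest-ranked) hit for each signature.
function dedupeByGroups(docs) {
  const bestBySignature = new Map();
  for (const doc of docs) {
    const key = groupSignature(doc);
    if (!bestBySignature.has(key)) bestBySignature.set(key, doc);
  }
  return [...bestBySignature.values()];
}

const dedupedIds = dedupeByGroups(results).map((d) => d.screenId);
console.log(dedupedIds); // ["screen_a", "screen_c"]
```

To pick a different "best candidate" per group, the `if (!bestBySignature.has(key))` check could be swapped for a comparison, such as preferring the screen with more unique element classes.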
l
Thanks Fanis! Appreciate the insights.