#community-help

Discussion on Snippeting Multiple Matches

TLDR bnfd asked about highlighting multiple matches when snippeting. Kishore Nallan explained that fields listed in highlight_full_fields are highlighted in full on recent builds, but that multiple snippets from different parts of a document are not supported because Typesense scores documents on the best matched portion. He suggested doing it as a client-side post-processing step using the <mark> tags.

Oct 29, 2021 (25 months ago)
bnfd
01:02 PM
Kishore Nallan I saw the issue regarding exhaustive highlighting and wanted to ask if there was any update regarding multiple matches snippeting: https://typesense-community.slack.com/archives/C01P749MET0/p1629833031204400?thread_ts=1629830554.200800&cid=C01P749MET0
Kishore Nallan
01:07 PM
Have you tried 0.22.0.rcs21 build?
Kishore Nallan
01:07 PM
Oh wait this is about highlighting all instances of a match within a document.

Kishore Nallan
01:09 PM
We made a bunch of highlight improvements in 0.22. I have to verify if multiple highlights are now possible for the full field.
bnfd
01:15 PM
I haven't tried the rcs21 build
Kishore Nallan
01:31 PM
bnfd I just checked. If you list the fields in highlight_full_fields then those fields are highlighted fully now on the latest builds.
Kishore Nallan
01:32 PM
Please try it out on 0.22.0.rcs21 and let me know!
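As a rough illustration of the suggestion above, a search request that lists a field in highlight_full_fields might be built like this. This is only a sketch: the host, the collection name `docs`, and the field names `title` and `body` are made-up placeholders, and `build_search_url` is a hypothetical helper, not part of any Typesense client.

```python
from urllib.parse import urlencode
from typing import List


def build_search_url(base_url: str, collection: str, query: str,
                     query_by: List[str], full_fields: List[str]) -> str:
    """Build a search URL that asks for full-field highlighting.

    The API key is not part of the URL; it would normally be sent
    in the X-TYPESENSE-API-KEY request header.
    """
    params = {
        "q": query,
        # Fields to search in, comma separated.
        "query_by": ",".join(query_by),
        # Fields whose entire value should come back highlighted,
        # rather than only the best-matching snippet.
        "highlight_full_fields": ",".join(full_fields),
    }
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


# Placeholder host, collection, and field names, for illustration only.
url = build_search_url("http://localhost:8108", "docs", "foo",
                       ["title", "body"], ["body"])
print(url)
```

The sketch only assembles the request; sending it and reading the highlighted value out of the response is left to whichever HTTP or Typesense client is in use.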
bnfd
01:33 PM
Ah this is without snippeting, right?
Kishore Nallan
01:33 PM
Yes, snippet is meant for highlighting the best matched portion.
Kishore Nallan
01:33 PM
You want multiple snippets from various parts of the document?
bnfd
01:34 PM
Yes
Kishore Nallan
01:34 PM
That is not possible because Typesense scores documents on the best matched portion.
Kishore Nallan
01:34 PM
I mean, if we were to do it, it must be done in a post processing step before display.
Kishore Nallan
01:35 PM
At least now, it is possible to do this on the client side because matched words will be marked with <mark> tags.
bnfd
01:36 PM
Ah, I get the relevance aspect but sometimes I want to return all lines containing "foo" from each doc
Kishore Nallan
01:36 PM
You can do it on the client side easily now. Split paragraphs into sentences, count the <mark> tags in each sentence, rank the sentences by that count, and show the top N.
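A minimal sketch of that post-processing step, assuming the input is the fully highlighted field text with matches wrapped in <mark> tags. The function name `top_snippets` and the sample text are illustrative, and the sentence splitter is deliberately naive (it breaks after `.`, `!` or `?` followed by whitespace), so real content may need something sturdier.

```python
import re
from typing import List

# One highlighted match, e.g. "<mark>foo</mark>".
MARK_RE = re.compile(r"<mark>.*?</mark>", re.DOTALL)
# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def top_snippets(highlighted: str, n: int = 3) -> List[str]:
    """Return up to n sentences, ranked by how many <mark> tags they contain."""
    sentences = [s.strip() for s in SENTENCE_SPLIT_RE.split(highlighted) if s.strip()]
    # Score each sentence by its number of highlighted matches.
    scored = [(len(MARK_RE.findall(s)), i, s) for i, s in enumerate(sentences)]
    # Drop sentences without a match; sort by count (descending),
    # keeping document order for ties.
    hits = sorted((t for t in scored if t[0] > 0), key=lambda t: (-t[0], t[1]))
    return [s for _, _, s in hits[:n]]


# Made-up sample text standing in for a highlighted field value.
text = ("The <mark>foo</mark> module loads first. Nothing relevant here. "
        "Later, <mark>foo</mark> calls <mark>foo</mark> again! Done.")
snippets = top_snippets(text, n=2)
for snippet in snippets:
    print(snippet)
```

To return every line containing a match instead of the top N, the same idea applies with a split on newlines and no cutoff.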

bnfd
01:48 PM
I was concerned that doing it on the frontend might result in a nontrivial overhead (in the case of a collection with many long documents) and having it offered by the engine would be beneficial. But maybe this is a rare use case.
bnfd
01:48 PM
Thanks for digging into whether it's possible!
Kishore Nallan
01:50 PM
Certainly we can add an option for doing it, but I'm just cautious about introducing too many new flags 🙂
bnfd
01:51 PM
Yeah, makes sense