#community-help

Query on "weighted_score" & Issue with Synonym Highlighting

TLDR Stefan asked about the "weighted_score" field and reported a possible synonym highlighting issue. Kishore Nallan clarified the use of "weighted_score". The possible synonym issue is still being investigated.

Jun 03, 2022 (19 months ago)
Stefan
12:08 PM
Hi, is there additional documentation on the "weighted_score" field? Where and how do I define it?
Kishore Nallan
12:09 PM
I don't follow you. Where do you see this weighted_score?
Stefan
12:10 PM
Is the weighted_score field just part of my schema?
Kishore Nallan
12:11 PM
Yes that's just a field on your schema
Stefan
12:11 PM
Ah okay, got it, thank you
Kishore Nallan
12:12 PM
Basically, this feature stops treating the text match score as a strict ordering criterion. "Bucketing" groups similar text match scores together so that they are deemed equal, and those results are then ranked only by the value of the custom weighted_score field.
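A minimal sketch of the bucketing idea described above, in plain Python. This is not Typesense's implementation: the function name `bucketed_rank`, the equal-sized bucket split, and the sample hits are all assumptions made for illustration. In Typesense itself the behaviour is requested through the `sort_by` search parameter, which, as far as I recall from the docs, looks like `_text_match(buckets: 10):desc,weighted_score:desc`; check the documentation for your version before relying on that syntax.

```python
def bucketed_rank(hits, num_buckets):
    """Rank hits by text-match bucket first, then by weighted_score (illustrative only)."""
    # Order by raw text match score, best first.
    ordered = sorted(hits, key=lambda h: h["text_match"], reverse=True)
    # Split the ordered hits into roughly equal-sized buckets (ceiling division).
    bucket_size = max(1, -(-len(ordered) // num_buckets))
    ranked = []
    for start in range(0, len(ordered), bucket_size):
        bucket = ordered[start:start + bucket_size]
        # Inside a bucket the text match is treated as a tie,
        # so only the custom weighted_score decides the order.
        bucket.sort(key=lambda h: h["weighted_score"], reverse=True)
        ranked.extend(bucket)
    return ranked


# Hypothetical hits: "a" and "b" match almost equally well, as do "c" and "d".
hits = [
    {"id": "a", "text_match": 100, "weighted_score": 1},
    {"id": "b", "text_match": 99, "weighted_score": 9},
    {"id": "c", "text_match": 50, "weighted_score": 5},
    {"id": "d", "text_match": 49, "weighted_score": 7},
]

print([h["id"] for h in bucketed_rank(hits, 2)])  # ['b', 'a', 'd', 'c']
print([h["id"] for h in bucketed_rank(hits, 4)])  # ['a', 'b', 'c', 'd']
```

With two buckets, "b" overtakes "a" because both land in the same bucket and "b" has the higher weighted_score. With four buckets every hit sits alone, so the order falls back to the strict text match ordering.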
Stefan
12:15 PM
Ah okay, then that is not what I am looking for.
Could it be that there is a "bug" where, when you define a synonym, a match through that synonym is not fully taken into account?

For example this result:
{'document': {'brand': "Paula's Choice",
              'key_ingredients_flat': ['ASCORBIC ACID',
                                       'ACETYL OCTAPEPTIDE-3',
                                       'FERULIC ACID',
                                       'PANTHENOL',
                                       'GLYCERIN',
                                       'SODIUM HYALURONATE',
                                       'TOCOPHEROL'],
              'name': 'Resist C15 Super Booster'},
 'highlights': [{'field': 'name',
                 'matched_tokens': ['C1'],
                 'snippet': 'Resist <mark>C1</mark>5 Super Booster'}],
 'text_match': 282583051272194},
Stefan
12:15 PM
for the query "paulaschoice C1"
with these synonyms
{'id': 'brand-Paulaschoice-synonyms', 'root': '', 'synonyms': ['paulas choice', 'paulas choiceq', 'paulas choice', 'paulas choice', 'paulaschoice']}
Stefan
12:16 PM
I would expect paulas choice to be in the highlighted fields
Kishore Nallan
12:17 PM
Is the issue around highlighting or the result not even showing up despite having a synonym defined for it?
Stefan
12:18 PM
The result shows up, but not on position one (which I would have expected). This is what lands on position 1:

{'document': {'brand': '',
              'key_ingredients_flat': ['SALICYLIC ACID',
                                       'LACTIC ACID',
                                       'SODIUM ASCORBYL PHOSPHATE',
                                       'GLYCERYL STEARATE SE',
                                       'CETYL ALCOHOL',
                                       'PRUNUS AMYGDALUS DULCIS OIL',
                                       'PRUNUS ARMENIACA KERNEL OIL',
                                       'PERSEA GRATISSIMA OIL',
                                       'COCOS NUCIFERA OIL',
                                       'GLYCERIN'],
              'name': 'VITAMIN C CLEANSER C1'},
 'highlights': [{'field': 'name',
                 'matched_tokens': ['C1'],
                 'snippet': 'VITAMIN C CLEANSER <mark>C1</mark>'}],
 'text_match': 282583068049410},
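To make the ranking difference concrete, here is a small sketch that compares the two text_match values quoted in this thread. The variable names are made up for illustration, and the snippet says nothing about how Typesense composes the score internally; it only shows how far apart the two numbers are.

```python
# text_match values copied from the two results posted in this thread.
paulas_choice_hit = 282583051272194  # 'Resist C15 Super Booster'
top_hit = 282583068049410            # 'VITAMIN C CLEANSER C1'

gap = top_hit - paulas_choice_hit
relative_gap = gap / top_hit

print(gap)             # 16777216
print(gap == 2 ** 24)  # True
print(f"{relative_gap:.2e}")
```

Both results highlight only the C1 token, and their scores differ by a tiny fraction, which is consistent with the brand match through the synonym adding little or nothing to the score of the Paula's Choice document.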
Stefan
12:18 PM
query:
results = client.collections[PUBLIC_SCHEMA_NAME].documents.search({
    'q': 'paulaschoice c1',
    'query_by': 'name,brand,key_ingredients_flat',
    'include_fields': 'name,brand,key_ingredients_flat',
    'prioritize_exact_match': False,
    'query_by_weights': '2,2,0',
})
Kishore Nallan
12:18 PM
On v0.22 or v0.23 RC?
Stefan
12:19 PM
0.73
Stefan
12:19 PM
*rc0.23.rc73
Kishore Nallan
12:19 PM
Ok can you DM me your cluster ID? I will take a look and get back to you.
