#community-help

Troubleshooting Typo Highlighting in Search Queries

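TLDR Stefan queried "chews" with num_typos set to 2, and "Roche" was matched and highlighted despite an edit distance of 4. Kishore Nallan asked him to test on the v0.20 RC. There, results containing the query were listed first, but unrelated results such as "Sachets" and "Lachesca" were still matched, and some query words were not highlighted. When only a single record was indexed and queried, highlighting worked. Kishore Nallan said he would look into the issue for the pending release.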

Apr 22, 2021
Stefan
11:50 AM
Also, this is strange: I set num_typos to 2, and "Roche" is considered a match for the query "chews" (I checked, and the edit distance is 4).
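
The distance mentioned here can be checked independently. The sketch below computes plain Levenshtein distance (insertions, deletions, substitutions) on lowercased strings. It assumes this is the metric being compared against num_typos; the function name `levenshtein` is illustrative and not part of Typesense.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein edit distance between a and b."""
    # prev[j] is the distance between the already-processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep a matching character
            ))
        prev = curr
    return prev[-1]


print(levenshtein("chews", "roche"))  # 4
```

A distance of 4 is above a typo limit of 2, which is why the match looks unexpected.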
Kishore Nallan
11:56 AM
:thinking_face: Is "Roche" highlighted? Also, please try on v0.20 RC to see if the same behavior exists.
Stefan
11:56 AM
Yes, "Roche" was highlighted.
Stefan
11:56 AM
and no other term is close
Stefan
11:56 AM
but I will check against v0.20
Kishore Nallan
11:57 AM
Thanks.
Stefan
12:22 PM
The following improved: matches that contain the query are listed first, but I still get a lot of random results
Stefan
12:22 PM
e.g. "Sachets" is matched, and so is "Lachesca"
Kishore Nallan
12:24 PM
Can you please create a test collection and index just that record and repeat the query and see if the record is returned?
Stefan
12:29 PM
yes it does
Stefan
12:29 PM
from pprint import pprint

from external_services.typesense_cloud import typesense_client

# Minimal test collection: a string id and a string name field.
schema = {
    'name': 'test',
    'fields': [
        {'name': 'id', 'type': 'string', 'facet': False, 'optional': False},
        {'name': 'name', 'type': 'string', 'facet': False, 'optional': False},
    ],
}


def bootstrap_data():
    """Create the 'test' collection and index only the record under test."""
    # typesense_client.collections['test'].delete()
    typesense_client.collections.create(schema)

    documents = [{'id': '1', 'name': 'Roche'}]
    typesense_client.collections['test'].documents.import_(
        documents, {'action': 'upsert', 'batch_size': 100}
    )


# Uncomment for the first run to create and fill the collection.
# bootstrap_data()

# res = typesense_client.collections['test'].retrieve()
# pprint(res)

# Query the single-record collection for "chews". query_by is passed as a
# parameter common to every search in the request.
search_requests = {
    'searches': [
        {'collection': 'test', 'q': 'chews'},
    ]
}

res = typesense_client.multi_search.perform(search_requests, {'query_by': 'name'})

print(res)
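
The script above does not show the num_typos setting mentioned earlier in the thread. A minimal sketch of the same request with the limit made explicit might look like this. `build_search_request` is a hypothetical helper, and setting `num_typos` per search next to `q` is an assumption about how the original query was configured.

```python
def build_search_request(collection: str, query: str, num_typos: int = 2) -> dict:
    """Build a multi_search payload with an explicit typo limit (hypothetical helper)."""
    return {
        'searches': [{
            'collection': collection,
            'q': query,
            'num_typos': num_typos,
        }]
    }


search_requests = build_search_request('test', 'chews')
print(search_requests)

# With a configured client, the payload would be sent the same way as above:
# res = typesense_client.multi_search.perform(search_requests, {'query_by': 'name'})
```

Stating the limit in the request makes it easier to rule out a configuration difference when comparing runs.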

Stefan
12:37 PM
Maybe I am running the wrong version (typesense/typesense:0.20.0.rc44), but the highlighting issue still seems to be there as well.
Kishore Nallan
12:38 PM
Which highlighting issue?
Stefan
12:39 PM
that some words are not highlighted
Stefan
12:39 PM
• Ensure that all queried fields are highlighted in search response
Stefan
12:40 PM
query: la roche posay cicaplast
Kishore Nallan
12:41 PM
Yes, that's still a work in progress, which is why we haven't released 0.20 yet. Some last-mile issues are left to be addressed.
Stefan
12:41 PM
ah okay cool, sorry thought it's done already
Kishore Nallan
12:42 PM
No worries, if you can help me do a quick test: can you again index just this one record where highlight fails and tell me if the same issue occurs when querying for a single record?
Kishore Nallan
12:43 PM
This will greatly help me frame test cases that I can then fix directly, in case this turns out to be an edge case different from the ones I've noticed before.
Stefan
12:48 PM
Not sure if I got it right, but I deleted the collection and reindexed just this one product.
Stefan
12:48 PM
If I do that, highlighting works.
Kishore Nallan
01:03 PM
Yeah, so it only fails when the record is queried alongside other records. Got it. I will be looking into this for this release.
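
The single-record versus full-collection comparison in this thread can be made repeatable with a small check on each hit. The sketch below assumes each hit carries a `highlights` list whose entries have a `snippet` string, with matches wrapped in the default `<mark>` tags. `unhighlighted_terms` is a hypothetical helper, and the sample hit is hand-written for illustration, not output captured from the thread.

```python
import re

MARK_RE = re.compile(r'<mark>(.*?)</mark>')


def unhighlighted_terms(query: str, hit: dict) -> list:
    """Return query terms that appear in no <mark>...</mark> span of the hit."""
    marked = set()
    for highlight in hit.get('highlights', []):
        for span in MARK_RE.findall(highlight.get('snippet', '')):
            marked.update(span.lower().split())
    return [term for term in query.lower().split() if term not in marked]


# Hand-written hit for illustration only.
sample_hit = {
    'document': {'id': '1', 'name': 'La Roche Posay Cicaplast'},
    'highlights': [{
        'field': 'name',
        'snippet': '<mark>La</mark> <mark>Roche</mark> Posay <mark>Cicaplast</mark>',
    }],
}

print(unhighlighted_terms('la roche posay cicaplast', sample_hit))  # ['posay']
```

The comparison is exact after lowercasing, so a term highlighted through typo correction would still be reported as missing. Running the same check against the single-record collection and the full collection would show whether the gap only appears when other records are present.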
