#community-help

Prefix Matching Issues in Typesense

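TLDR Toby reported that a prefix search returns the same text_match score for every result, and therefore an unexpected order, when the second query word starts with the same letter as the first. Jason suggested opening a GitHub issue for this bug, which Toby did.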

Aug 16, 2021 (29 months ago)
Toby
04:44 PM
Hi everyone!

I’m having some trouble getting prefix matches to work as I expect; with a small corpus of 10 documents with names starting with John, a prefix search for John W correctly shows John Williams with a text_match higher than the other Johns. But if you start the second word with the same letter that the first word starts with, i.e. John J, the text_match is the same for all results and therefore the order isn’t what you’d expect:

```
# Create the collection
curl "" \
       -X POST \
       -H "Content-Type: application/json" \
       -H "X-TYPESENSE-API-KEY: xyz" \
       -d '{
         "name": "johns",
         "fields": [
           {"name": "name", "type": "string" }
         ]
       }'

# Import the documents (JSONL, one document per line)
curl "" -X POST \
        -H "Content-Type: application/json" \
        -H "X-TYPESENSE-API-KEY: xyz" \
        -d '{ "id": "0", "name": "John Stark" }
        { "id": "1", "name": "John Atwood" }
        { "id": "2", "name": "John Smith" }
        { "id": "3", "name": "John Johnson" }
        { "id": "4", "name": "John Williams" }
        { "id": "5", "name": "John Brown" }
        { "id": "6", "name": "John Jones" }
        { "id": "7", "name": "John Garcia" }
        { "id": "8", "name": "John Miller" }
        { "id": "9", "name": "John Keller" }
        { "id": "10", "name": "John Davis" }'
```
```
curl -H "X-TYPESENSE-API-KEY: xyz" \
"http://localhost:8108/collections/johns/documents/search\
?q=John%20W&query_by=name&per_page=3"

{"facet_counts":[],"found":11,"hits":[{"document":{"id":"4","name":"John Williams"},"highlights":[{"field":"name","matched_tokens":["John","Williams"],"snippet":"<mark>John</mark> <mark>Williams</mark>"}],"text_match":50225924},{"document":{"id":"10","name":"John Davis"},"highlights":[{"field":"name","matched_tokens":["John"],"snippet":"<mark>John</mark> Davis"}],"text_match":33514496},{"document":{"id":"9","name":"John Keller"},"highlights":[{"field":"name","matched_tokens":["John"],"snippet":"<mark>John</mark> Keller"}],"text_match":33514496}],"out_of":11,"page":1,"request_params":{"collection_name":"johns","per_page":3,"q":"John W"},"search_time_ms":8}

curl -H "X-TYPESENSE-API-KEY: xyz" \
"http://localhost:8108/collections/johns/documents/search\
?q=John%20J&query_by=name&per_page=3"

{"facet_counts":[],"found":11,"hits":[{"document":{"id":"10","name":"John Davis"},"highlights":[{"field":"name","matched_tokens":["John"],"snippet":"<mark>John</mark> Davis"}],"text_match":50226176},{"document":{"id":"9","name":"John Keller"},"highlights":[{"field":"name","matched_tokens":["John"],"snippet":"<mark>John</mark> Keller"}],"text_match":50226176},{"document":{"id":"8","name":"John Miller"},"highlights":[{"field":"name","matched_tokens":["John"],"snippet":"<mark>John</mark> Miller"}],"text_match":50226176}],"out_of":11,"page":1,"request_params":{"collection_name":"johns","per_page":3,"q":"John J"},"search_time_ms":14}
```
Any ideas? Thanks!
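
One way to spot this symptom programmatically is to check whether every hit in a response carries the same text_match score. The snippet below is only a minimal sketch: it assumes the response shape shown in the two search results above, the helper name `has_tied_scores` is hypothetical, and the sample data is trimmed down to the names and scores from those results.

```python
def has_tied_scores(response: dict) -> bool:
    """Return True when there are two or more hits and all share one text_match."""
    scores = [hit["text_match"] for hit in response.get("hits", [])]
    return len(scores) > 1 and len(set(scores)) == 1


# Trimmed copy of the "John W" response: one hit outranks the others.
john_w = {
    "hits": [
        {"document": {"name": "John Williams"}, "text_match": 50225924},
        {"document": {"name": "John Davis"}, "text_match": 33514496},
        {"document": {"name": "John Keller"}, "text_match": 33514496},
    ]
}

# Trimmed copy of the "John J" response: every hit has the same score.
john_j = {
    "hits": [
        {"document": {"name": "John Davis"}, "text_match": 50226176},
        {"document": {"name": "John Keller"}, "text_match": 50226176},
        {"document": {"name": "John Miller"}, "text_match": 50226176},
    ]
}

print(has_tied_scores(john_w))  # False
print(has_tied_scores(john_j))  # True
```

A tie across all hits means the order falls back to something other than text relevance, which matches the unexpected ordering seen for John J.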
Jason
07:06 PM
Toby This sounds like a bug. Mind opening an issue on GitHub so we can track it?

Really appreciate the minimal reproducible test case! 🙏
Toby
07:47 PM
Jason Thanks for the quick response. Sure thing: https://github.com/typesense/typesense/issues/348
Jason
08:27 PM
Awesome, will keep you posted via GitHub