Troubleshooting Typo Correction in Typesense Search
TLDR: John reported unexpected typo-correction matches when running prefix searches in Typesense. Kishore Nallan tracked down and resolved the problem, providing John with an updated build to verify the fix.
May 27, 2022 (19 months ago)
John
12:22 PM
[...] `num_typos=2` we get that `earrings` matches `arvin` even though the edit distance is 4, but `earrin` and `arvin` have an edit distance of 2. Not sure it’s dropping, it’s just my best guess, but it seems like strange behaviour to me. It doesn’t happen with `prefix: false`.
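The two distances John quotes can be checked with a plain Levenshtein implementation. This is a minimal sketch: the helper name `levenshtein` is chosen for illustration, and Typesense's internal distance computation may differ in detail.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    # prev[j] holds the distance between the processed prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute (or keep) the character
                )
            )
        prev = curr
    return prev[-1]


print(levenshtein("earrings", "arvin"))  # 4
print(levenshtein("earrin", "arvin"))    # 2
```

The full query is four edits away from `arvin`, so the match is only explained at `num_typos=2` if two trailing characters of the query are ignored, which is what John suspects.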
Kishore Nallan
01:00 PM
John
01:01 PM
0.22.2
Kishore Nallan
01:10 PM
[...] `?q=arvin&query_by=title` I get no results. Can you provide a reproducible snippet?
John
01:17 PM
import typesense

COLLECTION = "example"

# Client for a local Typesense node.
client = typesense.Client(
    {
        "api_key": "TYPESENSEDEV",
        "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
        "connection_timeout_seconds": 2,
    }
)

# Two string fields; both are searched below.
client.collections.create(
    {
        "name": COLLECTION,
        "fields": [
            {"name": "title", "type": "string"},
            {"name": "brand", "type": "string"},
        ],
    }
)

# Document 1 contains the query word in its title.
client.collections[COLLECTION].documents.create(
    {"id": "1", "title": "daylight earrings gold plated", "brand": "foo"}
)
# Document 2 should not match: "arvin" is 4 edits away from "earrings".
client.collections[COLLECTION].documents.create(
    {"id": "2", "title": "something else", "brand": "arvin"}
)

# "prefix" is not set, so the default (prefix search enabled) applies.
result = client.collections[COLLECTION].documents.search(
    {
        "q": "earrings",
        "query_by": "title,brand",
        "use_cache": False,
        "num_typos": "2,2",
    }
)
print(result["hits"])
Kishore Nallan
01:47 PM
John
01:48 PM
Kishore Nallan
01:49 PM
May 30, 2022 (19 months ago)
John
08:02 AM
[...] `earrings` to `earring` even though it’s just 1 typo, example:
import typesense

COLLECTION = "example"

# Client for a local Typesense node.
client = typesense.Client(
    {
        "api_key": "TYPESENSEDEV",
        "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
        "connection_timeout_seconds": 2,
    }
)

# Drop the collection left over from the previous run, then recreate it.
client.collections[COLLECTION].delete()
client.collections.create(
    {
        "name": COLLECTION,
        "fields": [
            {"name": "title", "type": "string"},
            {"name": "brand", "type": "string"},
        ],
    }
)

# Exact match for the query.
client.collections[COLLECTION].documents.create(
    {"id": "1", "title": "daylight earrings gold plated", "brand": "foo"}
)
# One typo away from the query ("earring" vs "earrings").
client.collections[COLLECTION].documents.create(
    {"id": "2", "title": "fancy earring", "brand": "foo"}
)
client.collections[COLLECTION].documents.create(
    {"id": "3", "title": "something else", "brand": "arvin"}
)

result = client.collections[COLLECTION].documents.search(
    {
        "q": "earrings",
        "query_by": "title,brand",
        "use_cache": False,
        "num_typos": "2,2",
    }
)
print(result["hits"])
just gives the document with `earrings`.
Kishore Nallan
08:08 AM
The `num_typos` parameter is basically the maximum number of typos allowed. Since there is already a record with an exact match, other typos are not considered. This behavior can be tweaked with the `typo_tokens_threshold` parameter. This parameter controls the minimum number of results that should be fetched before typo relaxation is stopped. Since the default is 1, Typesense does not look for words with more typos when it finds at least one document with an exact match.
John
08:08 AM
1
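The interaction between `num_typos` and `typo_tokens_threshold` that Kishore describes can be illustrated with a toy model. This is a simplified sketch, not Typesense's actual algorithm: the function `search_with_typo_relaxation`, the whole-token matching, and the in-memory `DOCS` dictionary are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def search_with_typo_relaxation(query, docs, num_typos=2, typo_tokens_threshold=1):
    """Toy model: allow more typos step by step until enough documents match."""
    matched = []
    for allowed in range(num_typos + 1):  # 0 typos first, then 1, then 2, ...
        for doc_id, title in docs.items():
            if doc_id in matched:
                continue
            if any(levenshtein(query, token) <= allowed for token in title.split()):
                matched.append(doc_id)
        # Stop relaxing as soon as the threshold number of results is reached.
        if len(matched) >= typo_tokens_threshold:
            break
    return matched


# Titles taken from John's example above.
DOCS = {
    "1": "daylight earrings gold plated",
    "2": "fancy earring",
    "3": "something else",
}

print(search_with_typo_relaxation("earrings", DOCS))
print(search_with_typo_relaxation("earrings", DOCS, typo_tokens_threshold=2))
```

With the default threshold of 1, the exact match on document 1 ends the search immediately. Raising the threshold to 2 lets the model continue to one typo, which brings in document 2 (`earring`).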
John
08:26 AM
With `typo_tokens_threshold=50` and `num_typos=2` I still get `arvin` as a result when querying for `earrings`. With `num_typos=1` I don’t get it. I think that it should only show up if `num_typos=4`. It still only happens with `prefix=True`. This is on 0.23.0.rc70.
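One way to confirm that the unexpected match depends on prefix search is to run the same query with `prefix` on and off and compare the returned ids. The helper below is a hypothetical sketch (the name `compare_prefix_modes` is not part of the Typesense client); it assumes a `client` and collection set up as in John's snippets, and it only uses the search parameters already mentioned in this thread.

```python
def compare_prefix_modes(client, collection, query,
                         num_typos="2,2", typo_tokens_threshold=50):
    """Run one search per prefix setting and return the matched ids for each."""
    ids_by_mode = {}
    for prefix in ("true", "false"):
        result = client.collections[collection].documents.search(
            {
                "q": query,
                "query_by": "title,brand",
                "num_typos": num_typos,
                "typo_tokens_threshold": typo_tokens_threshold,
                "prefix": prefix,
            }
        )
        # Each hit wraps the stored document under the "document" key.
        ids_by_mode[prefix] = [hit["document"]["id"] for hit in result["hits"]]
    return ids_by_mode
```

Calling `compare_prefix_modes(client, COLLECTION, "earrings")` against the three-document collection above would show whether document 3 (`arvin`) appears only under `"true"`, which is the behaviour John reports.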
Kishore Nallan
08:58 AM
1
May 31, 2022 (19 months ago)
Kishore Nallan
05:32 AM
John
07:27 AM
Jun 07, 2022 (19 months ago)
John
06:13 AM
Kishore Nallan
06:20 AM
Kishore Nallan
06:28 AM
[...] `typesense/typesense:0.24.0.rc2` to Docker that contains this fix.
John
06:49 AM
John
07:40 AM
Kishore Nallan
07:40 AM