John
05/27/2022, 12:22 PM
num_typos=2 we get that earrings matches arvin even though the edit distance is 4, but earrin and arvin have an edit distance of 2. Not sure it’s dropping, it’s just my best guess, but it seems like strange behaviour to me. It doesn’t happen with prefix: false
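For reference, the two distances quoted above can be checked with a plain Levenshtein implementation. This is only a sketch: `levenshtein` and `best_prefix_distance` are hypothetical helper names, and the prefix helper merely illustrates the guess that trailing characters of the query are being dropped. It is not a description of how Typesense actually matches prefixes.

```python
from typing import Tuple


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute, or keep on a match
                )
            )
        previous = current
    return previous[-1]


def best_prefix_distance(query: str, token: str) -> Tuple[int, str]:
    """Smallest distance between any non-empty prefix of `query` and `token`.

    Assumption for illustration only: models "the query gets truncated".
    """
    return min(
        (levenshtein(query[:k], token), query[:k])
        for k in range(1, len(query) + 1)
    )


print(levenshtein("earrings", "arvin"))           # 4
print(levenshtein("earrin", "arvin"))             # 2
print(best_prefix_distance("earrings", "arvin"))  # (2, 'earrin')
```

Under this model, a 2-typo budget only reaches arvin if the query is shortened to earrin, which is consistent with the behaviour showing up only when prefix matching is on.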
Kishore Nallan
05/27/2022, 1:00 PM
John
05/27/2022, 1:01 PM
0.22.2
Kishore Nallan
05/27/2022, 1:10 PM
?q=arvin&query_by=title I get no results. Can you provide a reproducible snippet?
John
05/27/2022, 1:17 PM
import typesense

COLLECTION = "example"

client = typesense.Client(
    {
        "api_key": "TYPESENSEDEV",
        "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
        "connection_timeout_seconds": 2,
    }
)

client.collections.create(
    {
        "name": COLLECTION,
        "fields": [
            {"name": "title", "type": "string"},
            {"name": "brand", "type": "string"},
        ],
    }
)

client.collections[COLLECTION].documents.create(
    {"id": "1", "title": "daylight earrings gold plated", "brand": "foo"}
)
client.collections[COLLECTION].documents.create(
    {"id": "2", "title": "something else", "brand": "arvin"}
)

result = client.collections[COLLECTION].documents.search(
    {
        "q": "earrings",
        "query_by": "title,brand",
        "use_cache": False,
        "num_typos": "2,2",
    }
)
print(result["hits"])
Kishore Nallan
05/27/2022, 1:47 PM
John
05/27/2022, 1:48 PM
Kishore Nallan
05/27/2022, 1:49 PM
John
05/30/2022, 8:02 AM
earrings to earring even though it’s just 1 typo, example:
import typesense

COLLECTION = "example"

client = typesense.Client(
    {
        "api_key": "TYPESENSEDEV",
        "nodes": [{"host": "localhost", "port": "8108", "protocol": "http"}],
        "connection_timeout_seconds": 2,
    }
)

client.collections[COLLECTION].delete()
client.collections.create(
    {
        "name": COLLECTION,
        "fields": [
            {"name": "title", "type": "string"},
            {"name": "brand", "type": "string"},
        ],
    }
)

client.collections[COLLECTION].documents.create(
    {"id": "1", "title": "daylight earrings gold plated", "brand": "foo"}
)
client.collections[COLLECTION].documents.create(
    {"id": "2", "title": "fancy earring", "brand": "foo"}
)
client.collections[COLLECTION].documents.create(
    {"id": "3", "title": "something else", "brand": "arvin"}
)

result = client.collections[COLLECTION].documents.search(
    {
        "q": "earrings",
        "query_by": "title,brand",
        "use_cache": False,
        "num_typos": "2,2",
    }
)
print(result["hits"])
just gives the document with earrings
Kishore Nallan
05/30/2022, 8:08 AM
The num_typos parameter is basically a maximum value of typos allowed. Since there is already a record with an exact match, other typos are not considered. This behavior can be tweaked with the typo_tokens_threshold parameter. This parameter controls the minimum number of results that should be fetched before typo relaxation is stopped. Since the default is 1, Typesense does not look for words with more typos when it finds at least one document with an exact match.
John
05/30/2022, 8:08 AM
John
05/30/2022, 8:26 AM
typo_tokens_threshold=50 and num_typos=2 I still get arvin as a result when querying for earrings. With num_typos=1 I don’t get it. I think that it should only show up if num_typos=4. It still only happens with prefix=True.
This is on 0.23.0.rc70
Kishore Nallan
05/30/2022, 8:58 AM
Kishore Nallan
05/31/2022, 5:32 AM
John
05/31/2022, 7:27 AM
John
06/07/2022, 6:13 AM
Kishore Nallan
06/07/2022, 6:20 AM
Kishore Nallan
06/07/2022, 6:28 AM
typesense/typesense:0.24.0.rc2 to Docker that contains this fix.
John
06/07/2022, 6:49 AM
John
06/07/2022, 7:40 AM
Kishore Nallan
06/07/2022, 7:40 AM
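As a rough way to picture the typo relaxation discussed in this thread, the toy model below widens the typo budget one step at a time and stops once at least typo_tokens_threshold documents have been found. It is a sketch under loud assumptions: `search`, `levenshtein` and `DOCS` are hypothetical names, each document is flattened into one string of title and brand words, whole-word edit distance is used, and prefix matching is ignored entirely. It is not Typesense's implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


# Title and brand words of the three documents from the second snippet,
# flattened into one string per document (a simplification).
DOCS = {
    "1": "daylight earrings gold plated foo",
    "2": "fancy earring foo",
    "3": "something else arvin",
}


def search(query, docs, num_typos=2, typo_tokens_threshold=1):
    """Return matching doc ids, relaxing the typo budget step by step."""
    found = []
    for typos in range(num_typos + 1):
        for doc_id, text in docs.items():
            if doc_id in found:
                continue
            if any(levenshtein(query, token) <= typos for token in text.split()):
                found.append(doc_id)
        if len(found) >= typo_tokens_threshold:
            break  # enough results collected: stop relaxing
    return found


print(search("earrings", DOCS, num_typos=2, typo_tokens_threshold=1))
# ['1']
print(search("earrings", DOCS, num_typos=2, typo_tokens_threshold=50))
# ['1', '2']
```

In this model a threshold of 1 returns only the exact match, a higher threshold also brings in earring at one typo, and arvin stays out of reach at two typos because its distance from earrings is 4.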