#community-help

Typo Correction Issue in Typesense v0.24.1

TLDR Yoann encounters mysterious behavior in typo correction for certain query strings. Kishore Nallan will investigate the issue.


Jun 02, 2023 (6 months ago)
Yoann
08:05 AM
Hello! A question on typo correction, as some of its behaviours are quite mysterious to me.
I have a document with a field name = "La Bouitte".

With typo params in the search set to default and exhaustive search set to true, the typo correction works differently depending on the position of the typo:
• q="bouite" (one t) --> doc is found
• q="bouittee" (extra e) --> doc is found
• q="boutte" (no i) --> doc is found
• q="boitte" (missing u) --> doc is found
• q="buitte" (missing o) --> doc is not found
• q="ouitte" (missing b) --> doc is not found
• q="ouittee" (missing b, extra e) --> doc is found (I guess because num_typos defaults to 2)
So a character missing near the start of the word seems harder to correct. Why is that?
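As a rough check on the queries listed above, the sketch below computes the plain Levenshtein edit distance between each query and the indexed token "bouitte". It is not Typesense's matching code, only an illustration that every query, including the two that return nothing, is within the default typo budget of 2.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


TARGET = "bouitte"  # the token from the indexed name "La Bouitte"
QUERIES = ["bouite", "bouittee", "boutte", "boitte", "buitte", "ouitte", "ouittee"]
DISTANCES = {q: edit_distance(q, TARGET) for q in QUERIES}

for query, distance in DISTANCES.items():
    print(f"{query!r}: {distance} edit(s) from {TARGET!r}")
```

Since "buitte" and "ouitte" are each a single edit away, the missing results cannot be explained by edit distance alone.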
Kishore Nallan
08:11 AM
👋 When you say the doc is not found, do you mean that you get no results at all, or that you get other results?
Jun 05, 2023 (6 months ago)
Yoann
08:05 AM
Hello, yes, no results at all.
we made sure that there were no other docs matching these names
Kishore Nallan
01:45 PM
Would you be able to share a sample dataset where this issue exists?
Also please tell me the version of Typesense you are using.
Jun 06, 2023 (6 months ago)
Yoann
12:09 PM
v0.24.1
# Example doc
{
  "id": "16421",
  "name": ["la bouitte"],
  "resort": [419,422]
}

Search Request
{"searches":
  [{
    "query_by":"name",
    "collection":"prod__location",
    "filter_by": "resort:=[419]",
    "q":"buitte",
    "exhaustive_search": true
  }]
}
Yoann
12:20 PM
Create
curl --location '' \
--header 'Content-Type: application/json' \
--header 'X-TYPESENSE-API-KEY: xyz' \
--data '{
         "name": "test",
         "fields": [
            {"name": "name", "type": "string[]"},
            {"name": "resort", "type": "int32[]", "facet": true}
         ]
       }'

Insert
curl --location '' \
--header 'Content-Type: application/json' \
--header 'X-TYPESENSE-API-KEY: xyz' \
--data '{
         "name": ["la bouitte"],
         "resort": [419,422]
       }'

Search
curl --location --globoff '' \
--header 'X-TYPESENSE-API-KEY: xyz'

Response:
{
    "facet_counts": [],
    "found": 0,
    "hits": [],
    "out_of": 1,
    "page": 1,
    "request_params": {
        "collection_name": "test",
        "per_page": 10,
        "q": "buitte"
    },
    "search_cutoff": false,
    "search_time_ms": 0
}
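The search command above was posted without its URL, so the exact query string is not known. The sketch below shows one way the parameters from the earlier search request could be encoded for a single-collection search. The host and port in BASE_URL are hypothetical placeholders, and whether the curl reproduction used all of these parameters is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://localhost:8108"  # hypothetical host and port, not from the thread
COLLECTION = "test"                 # collection name from the reproduction above

# Parameters copied from the search request earlier in the thread.
PARAMS = {
    "q": "buitte",
    "query_by": "name",
    "filter_by": "resort:=[419]",
    "exhaustive_search": "true",
}


def build_search_url(base_url: str, collection: str, params: dict) -> str:
    """Return a documents/search URL with percent-encoded query parameters."""
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


SEARCH_URL = build_search_url(BASE_URL, COLLECTION, PARAMS)
print(SEARCH_URL)

# Round-trip check: decoding the query string gives back the original values.
DECODED = {key: values[0] for key, values in parse_qs(urlsplit(SEARCH_URL).query).items()}
print(DECODED)
```

Encoding the parameters this way avoids shell-level surprises with the square brackets in filter_by, which is presumably why the original command needed --globoff.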
Yoann
12:23 PM
The search works for "buitte l" and for "la buitte", but not for "l buitte" (as expected, since "l" has no exact match).
Kishore Nallan
12:23 PM
Thanks, will investigate and get back to you.
