#community-help

Extracting Typos & Dropped Tokens Info from Search Results

TLDR Dima wants to extract info about applied typos/dropped tokens from search results. Jason offers suggestions and encourages creating a GitHub issue for more structured information. Dima creates the issue.


Jun 14, 2023
Dima
08:07 PM
And another question: how can I extract info about applied typos / dropped tokens in the results? We want to collect this information in query analytics and maybe show the user a notice about possibly irrelevant results
Jason
08:10 PM
The search results don’t give additional information about typos / dropped tokens at the moment, but one way you could deduce this information is by using the matched_tokens key in the response and comparing it to the query that the user typed
Jason
08:11 PM
matched_tokens is also an array, so if you count the number of items in the array and compare it to the number of words in the query, you’ll be able to tell if drop tokens kicked in
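A minimal sketch of the counting heuristic described above. It assumes each hit carries a `highlights` list whose entries have a `matched_tokens` array; the helper names `matched_token_set` and `tokens_probably_dropped` are hypothetical, and the sample hit is illustrative rather than real output.

```python
def matched_token_set(hit):
    """Collect distinct, lowercased matched tokens across all highlighted fields."""
    tokens = set()
    for highlight in hit.get("highlights", []):
        for token in highlight.get("matched_tokens", []):
            # Assumption: array fields may nest tokens one level deeper, so flatten.
            if isinstance(token, list):
                tokens.update(t.lower() for t in token)
            else:
                tokens.add(token.lower())
    return tokens


def tokens_probably_dropped(query, hit):
    """Return True when fewer distinct tokens matched than the query contains."""
    query_words = set(query.lower().split())
    return len(matched_token_set(hit)) < len(query_words)


# Illustrative hit: only one of the three query words was matched.
sample_hit = {"highlights": [{"field": "title", "matched_tokens": ["red"]}]}
dropped = tokens_probably_dropped("red running shoes", sample_hit)
```

This is only a heuristic: a lower count signals that some query words did not match, but it does not say which ones or why.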
Dima
08:12 PM
What about prefix search? I thought matched_tokens may contain more elements than query tokens
Jason
08:13 PM
You’re right, the last word in the query will do a prefix search
Jason
08:14 PM
So you’d have to consider partial substring matches just for the last word
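A rough sketch of how the prefix caveat could be handled, assuming the matched tokens have already been gathered into a flat list of strings. The function name `classify_query_words` and the labels `exact`, `prefix`, and `unmatched` are hypothetical. Only the last query word is allowed to match as a prefix, mirroring the behaviour described above.

```python
def classify_query_words(query, matched_tokens):
    """Label each query word as 'exact', 'prefix' (last word only), or 'unmatched'."""
    words = query.lower().split()
    matched = {t.lower() for t in matched_tokens}
    labels = {}
    for index, word in enumerate(words):
        is_last = index == len(words) - 1
        if word in matched:
            labels[word] = "exact"
        elif is_last and any(t.startswith(word) for t in matched):
            # Only the final query word is treated as a prefix search.
            labels[word] = "prefix"
        else:
            # Either dropped or matched via typo correction; this check alone cannot tell.
            labels[word] = "unmatched"
    return labels


labels = classify_query_words("red running sho", ["Red", "shoes"])
```

An `unmatched` word may have been dropped or may have matched a typo-corrected token, so telling the two apart would need extra logic such as an edit-distance check.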

Jason
08:14 PM
In any case, I think it would be good to expose this information in a more structured way in the response.
Jason
08:15 PM
Could you create another GitHub issue for this, describing the ideal type of information (the JSON structure) that would help to power your use-case?

Dima
08:15 PM
Of course 🤓
