#community-help

Resolving Typesense Search Issues

TLDR Maximilian asked why Typesense returns different results depending on the order of the query words. Kishore Nallan explained that tokens are dropped from right to left, Mike described a related cross-field matching issue and a workaround, and Kishore Nallan later published a build intended to fix it. No final confirmation of resolution was provided.

Mar 29, 2022 (18 months ago)
Maximilian
11:11 AM
Hello, I'm seeing strange behaviour. I have a collection of categories that includes the string "smartphones". If I search for "smartphone apple", Typesense finds the right category. If I search for "apple smartphone", it finds something similar to "apple" ("cappe") but not "smartphones". If I set the parameter 'drop_tokens_threshold': 0, "apple smartphone" returns 0 results.

Is Typesense searching for every word or by phrase?

Thanks
Mar 30, 2022 (18 months ago)
Kishore Nallan
01:21 AM
👋 Currently Typesense drops tokens from right to left when it can't find a record with all matching words from the query. That's the reason for this difference. However, I'm currently working on an approach that fixes this. I'll have an update in a week.
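
To illustrate the right-to-left dropping described above, here is a toy sketch in plain Python. It is not Typesense's actual implementation: the function names and the prefix-only matching rule are assumptions made for illustration, and typo tolerance (which is what surfaced "cappe" in the real engine) is left out.

```python
def matches_all(tokens, text):
    """Return True if every query token is a prefix of some word in text."""
    words = text.lower().split()
    return all(any(w.startswith(t) for w in words) for t in tokens)


def search_with_right_to_left_drop(query, records):
    """Try the full query, then drop tokens from the right until something matches."""
    tokens = query.lower().split()
    while tokens:
        hits = [r for r in records if matches_all(tokens, r)]
        if hits:
            return tokens, hits
        tokens = tokens[:-1]  # drop the rightmost token and retry
    return [], []


records = ["smartphones", "cappe"]  # toy collection, assumed for illustration

# "apple" is dropped first, so the remaining token "smartphone" matches.
print(search_with_right_to_left_drop("smartphone apple", records))

# "smartphone" is dropped first, leaving only "apple", which has no prefix match here.
print(search_with_right_to_left_drop("apple smartphone", records))
```

In this toy model the useful token survives only when it comes first, which mirrors the order dependence reported in the question.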
Maximilian
08:41 AM
thank you very much!

Another question: I'm working with the Italian language. Is there a way to handle plurals and give them a higher score instead of treating them as "typos"?
Kishore Nallan
08:57 AM
We don't have a way to do that out of the box, but you can pre-process both the indexed records and the query before sending them to Typesense.
Maximilian
09:04 AM
Using synonyms during import or removing the last letter from the words during search, which would be better?
Kishore Nallan
09:05 AM
The latter. Synonyms are not suitable for a large number of words.
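
A minimal sketch of the "remove the last letter" idea in Python, assuming it is applied to both the indexed text and the query so the two sides stay comparable. The helper names and the minimum-length cutoff are hypothetical, and this is a crude normalizer rather than a real Italian stemmer (it ignores accented vowels and irregular plurals).

```python
VOWELS = "aeiou"


def strip_final_vowel(word, min_len=4):
    """Lowercase a word and drop its final vowel if the word is long enough."""
    w = word.lower()
    if len(w) >= min_len and w[-1] in VOWELS:
        return w[:-1]
    return w


def normalize(text):
    """Apply the same crude normalization to every word in a string."""
    return " ".join(strip_final_vowel(w) for w in text.split())


# Singular and plural collapse to the same form.
print(normalize("Telefono telefoni"))  # telefon telefon
```

Because short words are left untouched, terms like "blu" are not truncated, at the cost of missing plurals of very short nouns.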
Maximilian
09:08 AM
ok thanks
Mike
11:26 PM
We are having a similar issue. We are trying to do a search of multiple fields. If we type a single word it works as expected; if the token is found in any field, then it returns the proper documents. But if you type in TWO tokens, Typesense assumes those tokens will be in the same field. If it can’t find them in the same field it then breaks them apart (drops tokens) and treats it as an “OR” search, and returns all documents that contain token 1 OR token 2. We want it to return all documents that contain token 1 AND token 2, regardless of whether they are in the same field.

Let me know if you find some workaround. Thanks!
Mar 31, 2022 (18 months ago)
Andrew
01:36 PM
Hi Kishore Nallan, is there a GitHub issue for the fix you mentioned?
Kishore Nallan
01:38 PM
We will be posting updates on https://github.com/typesense/typesense/issues/562

Still working on this.
Kishore Nallan
01:40 PM
Mike, we will have something to show next week for this exact issue.
Mike
02:05 PM
Kishore Nallan, great to hear. As a workaround we just added a single string field that concatenates ALL fields. This generates the expected results. The worry is that the amalgamated string could get very large as we keep adding desired fields.
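
The workaround above could look roughly like this in Python, run on each document before indexing. The field names, the `all_text` target field, and the helper name are hypothetical; the thread does not say how the combined string was actually built.

```python
def add_combined_field(doc, fields, target="all_text"):
    """Return a copy of doc with one extra field joining the chosen fields."""
    parts = [str(doc[f]) for f in fields if doc.get(f)]
    return {**doc, target: " ".join(parts)}


# Hypothetical document, for illustration only.
doc = {"id": "1", "name": "iPhone 13", "brand": "Apple", "category": "smartphones"}
indexed = add_combined_field(doc, ["name", "brand", "category"])

print(indexed["all_text"])  # iPhone 13 Apple smartphones
```

Searching would then target only the combined field (for example via `query_by`), so all tokens are matched against one string. The combined value grows with every field added, which is the scaling concern raised in the message.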
Kishore Nallan
02:21 PM
Yes, this is not scalable in the long run, which is why we are tackling this so it's fixed once and for all.
Apr 14, 2022 (17 months ago)
Kishore Nallan
07:14 AM
Mike

I've published a new Docker build, 0.23.0.rc53, that addresses the cross-field matching issue you noticed. Can you please try it out and let me know?