Hello. We are in the process of testing Typesense...
# community-help
m
Hello. We are in the process of testing Typesense. We have created a collection of about 3,000 documents with 6-8 fields each. If we search for “basket” (without quotes), Typesense responds properly by returning ONLY the records that contain BASKET in any of the fields. But if we type “Blue Basket” (without quotes), it returns any record that contains blue OR basket (producing many irrelevant hits). Is there any way to change the OR to an AND? I would imagine it would be the same for “plastic ball”, returning EVERYTHING that has either plastic or ball in it, which could produce a large number of unrelated hits. It’s not really a phrase, and I can’t imagine users putting quotes around it (which I understand is needed for a phrase match). I know facets would help, but I’d think of those as a supplement to the search rather than a replacement.
j
@Mike Reno This is most likely happening due to drop_tokens_threshold (described here: https://typesense.org/docs/0.22.2/api/documents.html#typo-tolerance-parameters). Could you try setting this value to 0?
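For reference, here is a minimal sketch of what a search request with that parameter could look like. The host `http://localhost:8108`, the collection name `products`, and the field names `product` and `material` are assumptions for illustration, not details from this thread. The snippet only builds the request URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_search_url(base_url, collection, params):
    """Build the documents/search URL for a Typesense collection."""
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


# Hypothetical collection and field names, for illustration only.
search_params = {
    "q": "plastic ball",
    "query_by": "product,material",
    "drop_tokens_threshold": 0,  # 0 disables dropping tokens to widen the results
}

url = build_search_url("http://localhost:8108", "products", search_params)
print(url)
```

The URL would be sent as a GET request with the API key in the `X-TYPESENSE-API-KEY` header.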
m
Thanks. I'll run this by the developer. It's likely to be at the default of 1. I'm trying to understand this. The description suggests "If at least X number of results are not found, then drop token(s) until enough results are found". If the default is 1, then it would be "If at least 1 result is not found, then drop token(s) until something is found". Our problem is not that the query returns too few (or no) results, it's that it produces far too many.
j
It would be: "If at least 1 result is not found with the words "plastic ball" in a single field, then drop token(s) until something is found". Meaning, search for records with "plastic", also search for records with "ball" and show all of those as results.
In other words, if the original query doesn't produce enough results, it tries to fetch more results by changing the query.
So you're seeing more results in the end because the original query returned too few.
m
Still a little fuzzy... sorry. Not sure where we set the minimum, but in this case we know there are at least several hits. Imagine one field is "Material", and another is "Product". We have 10 documents with "ball" in the Product field, and 500 products with the word "plastic" in the material field. We know there are exactly 3 plastic balls. We search for ball, and get 10 results (as expected). We search for "plastic ball", and get 510 results.
j
The first query Typesense will attempt to do is search for "plastic ball" in a single field, in a single document. In your case, it sounds like there is no record with both those words in the same field.
This is what drop_tokens_threshold picks up on
Since the full search query didn't produce any results, it will then drop "plastic" and search for "ball" and vice versa.
Could you share the full search query, including all the search params you're using? There might be more factors at play here.
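The fallback described above can be modelled with a small toy sketch. This is a simplified illustration of the explanation in this thread, not Typesense's actual implementation; the documents, field names, and the `search` helper are all hypothetical.

```python
def tokens_in_one_field(doc, tokens):
    """True if at least one field of doc contains every token."""
    return any(
        all(t in value.lower().split() for t in tokens)
        for value in doc.values()
    )


def search(docs, query, drop_tokens_threshold):
    """Toy model: require all tokens in a single field, then fall back."""
    tokens = query.lower().split()
    hits = [d for d in docs if tokens_in_one_field(d, tokens)]
    # Enough hits (or nothing left to drop): return the strict matches.
    if len(hits) >= drop_tokens_threshold or len(tokens) < 2:
        return hits
    # Too few hits: search for each token on its own and merge the results.
    seen = {id(d) for d in hits}
    for tok in tokens:
        for d in docs:
            if id(d) not in seen and tokens_in_one_field(d, [tok]):
                seen.add(id(d))
                hits.append(d)
    return hits


docs = [
    {"product": "ball", "material": "plastic"},
    {"product": "ball", "material": "rubber"},
    {"product": "basket", "material": "plastic"},
    {"product": "chair", "material": "wood"},
]

wide = search(docs, "plastic ball", drop_tokens_threshold=1)
strict = search(docs, "plastic ball", drop_tokens_threshold=0)
print(len(wide), len(strict))
```

In this model, a threshold of 1 widens "plastic ball" to every document containing either word, because no single field holds both. A threshold of 0 keeps only the strict matches, which here is an empty list.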
m
Ahhh... this makes sense now! I can ask our developer to contact you about it. You had emailed him when we first started this R&D. When we encountered this problem I thought it would be more appropriate to post questions here (and on GitHub) rather than having him contact you directly. 🙂 The developer is in India, so I will have him reach out in his morning. @Jason Bosco Thanks so much.
j
Posting on GitHub would be great actually, since it helps others who might have a similar question.
m
@Jason Bosco Our developer emailed you. If/when we get this resolved we will pay it forward by posting on GitHub. 🙂
@Jason Bosco I replied-all to Rajinder's email with additional information. Perhaps it went into your spam? I just re-sent it.
j
I did get Rajinder's email and I responded to him, and he responded back as well.
I see you're CCed on the thread.
m
@Jason Bosco Yes. But you are apparently not receiving my emails. I had additional info that Rajinder did not include...
j
Ah yes, I just noticed your responses in my spam folder. Hmm!