# community-help
a
Hi everyone, I'm experiencing inconsistencies with search results in my Typesense collection of 80,000 product documents using CLIP model embeddings. The search results are mostly incorrect relative to the keywords, despite several troubleshooting attempts including disabling the cache and upgrading the cluster to v27.0. For instance, when I search for keywords like "book," "car," "television," or "led tv," the results are initially inaccurate. If I retry the search after a minute, the results momentarily align with the search terms before reverting to being incorrect in a few seconds. This pattern of fluctuation persists across different keywords. Could this be related to cluster settings, my query approach, or some other issue? Any guidance or suggestions would be greatly appreciated. Thanks for your help!
k
Hi Avi, do you use a multi-node Typesense cluster to host these embeddings?
a
Yes, Kishore. My cluster config is: 3 Nodes Multi-Node HA Single Region - Frankfurt
k
Make an explicit curl API request to each node and check whether all of them return bad results.
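For readers who prefer a script to raw curl, the per-node check could be sketched roughly as follows. This is only an illustration under assumptions: the node hostnames, API key, and collection name are placeholders, and `embedding` is taken from the field name that appears later in this thread.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholders: substitute your own node hosts, key, and collection.
NODES = [
    "xxx-1.a1.typesense.net",
    "xxx-2.a1.typesense.net",
    "xxx-3.a1.typesense.net",
]
API_KEY = "YOUR_SEARCH_ONLY_KEY"
COLLECTION = "products"


def search_url(host, collection, params):
    """Build the search URL that targets one specific node."""
    query = urllib.parse.urlencode(params)
    return f"https://{host}/collections/{collection}/documents/search?{query}"


def top_ids(host, params, limit=5):
    """Query a single node and return the ids of its top hits."""
    req = urllib.request.Request(
        search_url(host, COLLECTION, params),
        headers={"X-TYPESENSE-API-KEY": API_KEY},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        body = json.load(resp)
    return [hit["document"]["id"] for hit in body["hits"][:limit]]


def compare_nodes(params):
    """Print the top ids from each node so differences are easy to spot."""
    for host in NODES:
        print(host, top_ids(host, params))
```

Calling `compare_nodes({"q": "book", "query_by": "embedding"})` would then print one line per node; if only some nodes disagree, the problem is likely node-specific rather than query-specific.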
a
I tried doing that but each node is giving inconsistent results.
k
Does at least one of them give correct results? That might explain why it sometimes returns correct results.
a
Nope, each of them is inconsistent. I tried each node for 10 minutes and, as far as I can tell, there is no fixed pattern to this issue. I can share the URL of my website build, which might help in understanding the cause.
k
Are you saying that when you query the same node, each time it returns a different set of results?
a
Yes.
I am sending you a DM with a screen recording and URL for you to test.
k
Got it. Can you try commenting out the `sort_by` field in the request parameters?
Interestingly, when I search on your website, I get correct results for all queries.
a
Try again in a few seconds or minutes. Try the keywords "book" and "books".
I know it appears to be working fine at first, but that's why I recorded a video to capture the times when it doesn't work correctly.
I sent you another DM with screenshots that I took just now.
I also tried removing the sort option, but unfortunately the issue is still the same.
k
Yeah I see it now.
Are you making the API call from your backend?
a
Yes
k
Try making consecutive calls to the Typesense API directly via curl and see if the problem still happens. This is to rule out any application-related state.
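One way to run that consecutive-call test is a small harness that repeats the same search and counts how many distinct result lists come back. The sketch below is an assumption-laden illustration: `search_fn` is a hypothetical callable you supply that performs one direct Typesense request and returns the hit ids in order.

```python
import time


def repeat_search(search_fn, runs=10, delay_s=1.0):
    """Call search_fn repeatedly and collect the ordered hit ids from each run."""
    results = []
    for i in range(runs):
        results.append(list(search_fn()))
        if i < runs - 1:
            time.sleep(delay_s)  # space the calls out, as in the manual test
    return results


def distinct_result_sets(results):
    """Return the distinct ordered id lists seen across all runs."""
    seen = []
    for ids in results:
        if ids not in seen:
            seen.append(ids)
    return seen
```

If `distinct_result_sets(repeat_search(search_fn))` contains more than one entry for an unchanged query against an unchanged index, the non-determinism is coming from the search itself and not from application state.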
a
Okay, I will try that. And update you on this.
I ran the test using a simple Python script, and the issue is still the same. I even tried querying individual nodes, but that didn't resolve it. I have DMed you the screenshot.
k
To close this thread, increasing the `ef` search parameter helped make the results deterministic:
"vector_query": "embedding:([], ef:100)",
The default value of `10` made the index explore only a shallow set of candidates, which also introduced some non-determinism.
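As a rough illustration of where this parameter goes, the sketch below builds a request body with the `vector_query` shown above, assuming it is sent as a POST to the `multi_search` endpoint. The collection name `products` and the helper name `build_multi_search_body` are hypothetical; only the `embedding` field and `ef:100` come from this thread.

```python
import json


def build_multi_search_body(collection, q, field="embedding", ef=100):
    """Build a multi_search body that raises ef for the vector search."""
    return {
        "searches": [
            {
                "collection": collection,
                "q": q,
                "query_by": field,
                # Empty vector: the query embedding is derived from q,
                # while ef widens the candidate list explored per query.
                "vector_query": f"{field}:([], ef:{ef})",
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_multi_search_body("products", "book"), indent=2))
```

A larger `ef` generally trades some latency for better recall, so it may be worth trying a few values against your own data instead of treating 100 as a universal setting.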
a
Awesome, this resolved my issue. Thanks a lot once again for such great support!