TLDR Md raised an issue about differing search results when using Postman and a Python library. Kishore Nallan suggested trying a multi_search request to compare values, and to set a distance threshold on the vector component. The issue was resolved.
Can you please post a small reproducible example?
I am just getting the IDs so that I can fetch the records sequentially from the DB with a Django queryset: `search_parameters1 = {` `'q': remove_trailing_whitespace(search),` `'query_by': 'embedding,title,job_responsibility,skills,industry,location,type,division,company',` `'include_fields': 'id',` `'per_page': 250,` `'page': 1,` `'sort_by': '_vector_distance:desc'` `}` `res = client.collections['alljobsupdated'].documents.search(search_parameters1)` `newlist = [x['document']['id'] for x in res['hits']]` `print(newlist)`
Here the order and contents of the list differ from what I get when fetching directly from Postman, like {server}/collections/alljobsupdated/documents/search?q=laravel&query_by=embedding,title,job_responsibility,skills,industry,location,type,division,company&sort_by=_vector_distance:desc&include_fields=id&page=250
Can you try a multi_search request with both the Python client and Postman, and compare those values? I wonder if there is some URL encoding issue because of the GET params.
Multi search uses POST so we can rule that out
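A minimal sketch of that comparison, assuming the same collection and parameters as the search call earlier in the thread. The helper names `build_multi_search_body` and `extract_ids` are hypothetical, introduced only for illustration. The idea is to build one JSON body, send the identical body from both Postman and the Python client, and compare the ordered ID lists.

```python
import json

# Values copied from the search call posted earlier in the thread.
COLLECTION = "alljobsupdated"
QUERY_BY = (
    "embedding,title,job_responsibility,skills,"
    "industry,location,type,division,company"
)


def build_multi_search_body(query: str) -> dict:
    """Build a JSON body for POST {server}/multi_search (hypothetical helper)."""
    return {
        "searches": [
            {
                "collection": COLLECTION,
                "q": query,
                "query_by": QUERY_BY,
                "include_fields": "id",
                "per_page": 250,
                "page": 1,
                "sort_by": "_vector_distance:desc",
            }
        ]
    }


def extract_ids(multi_search_response: dict) -> list:
    """Return document IDs, in order, from the first result set (hypothetical helper)."""
    first_result = multi_search_response["results"][0]
    return [hit["document"]["id"] for hit in first_result["hits"]]


body = build_multi_search_body("laravel")

# Paste this JSON into Postman as the raw body of POST {server}/multi_search.
print(json.dumps(body, indent=2))

# With the Python client, assuming `client` is an already configured
# typesense.Client, the same body could be sent like this:
# res = client.multi_search.perform(body, {})
# print(extract_ids(res))
```

Because both sides send the same POST body, any remaining difference in the ID lists cannot come from how GET query parameters were encoded.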
okay thanks bhai, I will try it
It is working now, but it sometimes returns irrelevant data. Do we need to remove stop words before inserting into Typesense?
Irrelevant data can happen because of semantic search.
Is there any possibility of tweaking the fusion mechanism dynamically?
Can you try setting a distance threshold on the vector component: ```vector_query=embedding:([], distance_threshold:0.30)``` In the snippet above, `embedding` is the name of the vector field. Try adding the above to your queries to see if it helps.
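As a rough sketch, the threshold could be added to the Python search parameters like this. The helper `with_distance_threshold` is hypothetical, and the query and parameters are assumed to match the ones used earlier in the thread; the 0.30 value is just the starting point suggested above and would need tuning against real results.

```python
def with_distance_threshold(params: dict, field: str = "embedding",
                            threshold: float = 0.30) -> dict:
    """Return a copy of params with a vector_query that sets distance_threshold.

    Hypothetical helper: `field` must be the name of the vector field
    in the collection schema.
    """
    updated = dict(params)  # shallow copy so the caller's dict is not modified
    updated["vector_query"] = f"{field}:([], distance_threshold:{threshold:.2f})"
    return updated


search_parameters = {
    "q": "laravel",
    "query_by": (
        "embedding,title,job_responsibility,skills,"
        "industry,location,type,division,company"
    ),
    "include_fields": "id",
    "per_page": 250,
    "page": 1,
}

tuned = with_distance_threshold(search_parameters)
print(tuned["vector_query"])  # embedding:([], distance_threshold:0.30)

# Assuming `client` is an already configured typesense.Client:
# res = client.collections["alljobsupdated"].documents.search(tuned)
```

Keeping the threshold as a function argument makes it easy to try a few values per query and see which one filters out the irrelevant hits.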
thank you for the support
Md
Tue, 22 Aug 2023 09:40:15 UTC
I have a problem where hybrid search returns different results in Postman than it does through the Python library. typesense==0.16.0 (Python client)