#community-help

Querying String Arrays with Prefix=false and Typesense

TLDR Roman needed assistance with querying string arrays using prefix=false in Typesense. Kishore Nallan clarified how searching array elements works and advised using a plain string.

Solved
Feb 21, 2023 (10 months ago)
Roman
07:50 AM
Hi! I was hoping to get some assistance on prefix=false usage.
Roman
07:51 AM
It was originally discussed here. What we’re doing:
• query_by has a single string[] field
• no additional params except prefix=false are set. We want to find documents that have both “react” (or its synonyms) and “python” (or its synonyms) in the string[] field being queried; we don’t need prefix search for any of these terms
• there are synonyms defined for “react”: “reactjs,REACT.JS,reactJS,React js,React,React.JS,react js,react,react-jsx,react.js,Reactjs,React.Js,React.js,ReactJS,React JS”
• no synonyms for word “python” are set
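The setup in the list above can be sketched as a plain parameter dictionary. This is only an illustration: the field name `tags` is borrowed from later in the thread, and the helper name `build_search_params` and the shortened synonym list are assumptions, not something Roman posted.

```python
def build_search_params(q: str, field: str = "tags") -> dict:
    """Return search parameters matching the setup described in the thread."""
    return {
        "q": q,
        "query_by": field,   # a single string[] field
        "prefix": "false",   # whole-word matching: "java" should not match "javascript"
    }


# Multi-way synonym set for "react" (shortened from the list in the message).
react_synonyms = {
    "synonyms": ["react", "reactjs", "react.js", "react js", "react-jsx"],
}

# The two queries Roman expects to return the same number of records.
params_a = build_search_params("react python")
params_b = build_search_params("python react")
```

Only `q` differs between the two parameter sets, so any difference in results comes from how the engine handles token order.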
Roman
07:51 AM
Results differ depending on word order in q; we’d expect Typesense to find the same number of records for both “react python” and “python react”.
[Two screenshots comparing results for “react python” and “python react”]
Roman
07:55 AM
Interestingly, removing prefix=false makes the results roughly the same, but I think we want to keep it because we don’t want a q like “python java” to return documents containing “javascript”
[Two screenshots of the results without prefix=false]
Roman
08:00 AM
There are some synonyms that contain the word “react”, like “create-react-app, cra, create react app”, if that makes any difference
Kishore Nallan
08:10 AM
When Typesense queries against a string array field, we still expect both tokens in the query to appear within a single array element. So react python will NOT match against a field value of ["react", "python"] because they are considered as separate elements.
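The rule Kishore describes can be illustrated with a toy model. This is a simplified sketch, not Typesense's implementation: it assumes whole-word matching (as with prefix=false) and ignores synonyms, typo tolerance and ranking. The function names are hypothetical.

```python
def element_matches(element: str, tokens: list[str]) -> bool:
    """True if every query token appears as a whole word in one array element."""
    words = element.lower().split()
    return all(token in words for token in tokens)


def array_field_matches(values: list[str], q: str) -> bool:
    """True if at least one single element contains all query tokens."""
    tokens = q.lower().split()
    return any(element_matches(value, tokens) for value in values)


# Tokens spread across separate elements: no match.
print(array_field_matches(["react", "python"], "react python"))  # False
# Both tokens inside one element: match, regardless of word order.
print(array_field_matches(["react python"], "python react"))     # True
```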

Kishore Nallan
08:12 AM
To achieve that, either index all tags within a plain string field, space-separated, or use a filter query, e.g. filter_by=tags:[react, python]
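A minimal sketch of building the filter string from a list of tags. The helper names are hypothetical. One point to verify against the Typesense documentation for the version in use: the list form `tags:[a, b]` is documented as matching documents that contain any of the listed values, so requiring every tag may need separate clauses joined with `&&`, as the second helper assumes.

```python
def any_of_filter(field: str, values: list[str]) -> str:
    """Build the list-style filter quoted in the thread, e.g. tags:[react, python]."""
    return f"{field}:[{', '.join(values)}]"


def all_of_filter(field: str, values: list[str]) -> str:
    """Build one exact-match clause per value, joined with && (assumed AND semantics)."""
    return " && ".join(f"{field}:={value}" for value in values)


print(any_of_filter("tags", ["react", "python"]))  # tags:[react, python]
print(all_of_filter("tags", ["react", "python"]))  # tags:=react && tags:=python
```

Either string would be passed as the `filter_by` search parameter alongside `q`.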
Roman
11:19 AM
Thanks!
Feb 22, 2023 (10 months ago)
Roman
04:15 PM
Wait I think I misread
> we still expect both tokens in the query to appear within a single array element
I don’t think so? I mean the screenshots above show that Typesense successfully queries against a string array field; it’s just that the number of results changes depending on whether q is set to “react python” or “python react”

Is there a way to tell Typesense that the order of words doesn’t matter? And keep prefix=false so “java” matches only “java” and not “javascript”
Kishore Nallan
04:22 PM
So the way it works is that first we try to find both tokens and, if that fails, we try dropping tokens from right to left and then from left to right until we find enough records. This is why you get those results.
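The fallback order described here can be sketched as a toy generator of candidate queries. This is a simplified reading of the message, not Typesense's actual code, and `candidate_queries` is a hypothetical name. Typesense exposes a `drop_tokens_threshold` search parameter that governs this fallback; whether changing it gives the behaviour Roman wants is not confirmed in the thread.

```python
def candidate_queries(q: str) -> list[str]:
    """List the queries tried in order: full query, then tokens dropped
    from right to left, then tokens dropped from left to right."""
    tokens = q.split()
    candidates = [" ".join(tokens)]
    # Drop tokens from the right end, one at a time.
    for end in range(len(tokens) - 1, 0, -1):
        candidates.append(" ".join(tokens[:end]))
    # Then drop tokens from the left end, one at a time.
    for start in range(1, len(tokens)):
        candidates.append(" ".join(tokens[start:]))
    return candidates


print(candidate_queries("react python"))  # ['react python', 'react', 'python']
print(candidate_queries("python react"))  # ['python react', 'python', 'react']
```

In this model the first fallback differs between the two word orders ("react" versus "python"), so if the engine stops once it has enough records, the two queries can return different result counts.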
Roman
04:28 PM
So it fails to find “python react scala” among strings in array, then looks for just “python react”, then looks for just “python” and returns documents that have “python” among strings in array?
Roman
04:31 PM
What would be the best way to search for documents that have tags: string[] containing all words present in q (or their synonyms)?
Feb 24, 2023 (9 months ago)
Roman
12:06 PM
Hi Kishore Nallan, was hoping you can advise on this
Kishore Nallan
12:08 PM
Searches against an array compare query words against individual elements of the array, not across elements. So a query of foo bar will NOT match a [foo, bar] array.
Kishore Nallan
12:08 PM
You have to use a plain string.
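One way to follow this advice is to add a space-joined copy of the tags to each document before indexing and point query_by at that field. A minimal sketch, assuming a hypothetical field name `tags_text` and a toy whole-word matcher that ignores synonyms and typo tolerance:

```python
def with_plain_string(doc: dict, source: str = "tags", target: str = "tags_text") -> dict:
    """Return a copy of doc with the array field joined into one plain string."""
    enriched = dict(doc)
    enriched[target] = " ".join(doc.get(source, []))
    return enriched


def plain_string_matches(text: str, q: str) -> bool:
    """Toy whole-word check: every query token must appear in the string."""
    words = set(text.lower().split())
    return all(token in words for token in q.lower().split())


doc = with_plain_string({"id": "1", "tags": ["react", "python"]})
print(doc["tags_text"])                                         # react python
print(plain_string_matches(doc["tags_text"], "python react"))   # True
print(plain_string_matches("python javascript", "java"))        # False
```

Note that joining loses element boundaries, so a multi-word tag such as "react js" becomes indistinguishable from two separate tags.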
