#community-help

Token Priorities and Infix Search in Typesense Multi-word Queries

TLDR Sidharth asked how to prioritize tokens in a multi-word Typesense query. Kishore Nallan explained that only the last word of a query is prefix-searched, and suggested infix search and data modelling as potential solutions. However, he emphasized that infix search does not support multiple words and is only recommended for small datasets.

Jan 03, 2023
Sidharth
06:52 AM
Hello folks,
Is there a way to apply token priority to a multi-word query?
E.g., for a query like "rel ind",
we currently get results only for the token "ind".

Please point us to a parameter that gives results such that
we get matches for the different keywords
and
token priority is applied: "rel" gets first priority, then "ind", and so on.
Kishore Nallan
06:55 AM
Does rel actually match a word in the dataset?
Kishore Nallan
06:55 AM
Typesense does a prefix search only on the last word in the query.
Sidharth
07:04 AM
rel is a prefix of a word in the database,
e.g. reliance
Kishore Nallan
07:22 AM
Yes, so that's why it's not matching. Only the last word is prefix-searched, since that's what is useful in a typeahead autocomplete use case.
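Note: the behaviour described above can be illustrated with a toy model. This is only a sketch of the rule as stated in the thread (exact match for every token except the last, prefix match for the last one); it is not Typesense's implementation and ignores typo tolerance and ranking. The function name `matched_query_tokens` is hypothetical.

```python
def matched_query_tokens(query: str, document: str) -> list[str]:
    """Return the query tokens that match the document under a
    'prefix on last token only' rule (toy model, not Typesense code)."""
    words = document.lower().split()
    tokens = query.lower().split()
    matched = []
    for i, token in enumerate(tokens):
        is_last = i == len(tokens) - 1
        if is_last:
            # The last token may match the beginning of any word.
            hit = any(word.startswith(token) for word in words)
        else:
            # Earlier tokens must match a whole word.
            hit = token in words
        if hit:
            matched.append(token)
    return matched


# "rel" is not a whole word in the document, so only "ind" matches.
print(matched_query_tokens("rel ind", "RELIANCE INDUSTRIES LTD"))
# Spelling out the first word lets both tokens match.
print(matched_query_tokens("reliance ind", "RELIANCE INDUSTRIES LTD"))
```

Under this model, "rel ind" matches only on "ind", which is consistent with what Sidharth reported.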
Sidharth
07:41 AM
For our use case we don't want the typeahead feature.

Can you point us to parameters that do the following:
• Samples that match more of the query words get priority.
◦ E.g. for the query rel ind, prioritize results where both "rel" and "ind" are matched.
• Further, rel gets priority over ind.
Sidharth
08:52 AM
Kishore Nallan, could you please guide us on the above use case?
Kishore Nallan
09:41 AM
If you want all parts of a query to match, then you have to use the infix search option, but that won't be very fast.
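Note: for readers unfamiliar with the infix option, here is a minimal sketch of how it is typically enabled, assuming the typesense-python client. The collection name `instruments` is hypothetical; the field names are taken from the highlight output quoted in this thread. Infix has to be switched on per field in the schema and then requested per `query_by` field at search time.

```python
# Hypothetical collection schema: "infix": True builds an infix index for the field.
schema = {
    "name": "instruments",  # assumed collection name, not from the thread
    "fields": [
        {"name": "tradingSymbol", "type": "string", "infix": True},
        {"name": "name", "type": "string", "infix": True},
    ],
}

# One infix mode per query_by field; documented values are off, always, fallback.
search_parameters = {
    "q": "rel",
    "query_by": "tradingSymbol,name",
    "infix": "always,always",
}

# With a configured typesense.Client these would be passed roughly as:
#   client.collections.create(schema)
#   client.collections["instruments"].documents.search(search_parameters)
print(search_parameters)
```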
Kishore Nallan
09:42 AM
There's no way to prioritize one word in a query over another.

You probably need to think about how you can model your data so that you can achieve what you want. It will be difficult for me to advise you on the modelling unless I understand your use case better.
Sidharth
09:49 AM
Ok, sure
Sidharth
10:14 AM
Hello Kishore Nallan,
Can we apply infix to multiple words in a query?
Currently, for the example query rel ind, we get a match with the highlights below, in which infix is applied only to the first token:
'highlights': [{'field': 'tradingSymbol',
     'matched_tokens': ['RELIANCE'],
     'snippet': '<mark>RELIANCE</mark>'},
    {'field': 'name',
     'matched_tokens': ['RELIANCE'],
     'snippet': '<mark>RELIANCE</mark> INDUSTRIES LTD'},
    {'field': 'synonymField',
     'matched_tokens': ['Reliance', 'Reliance', 'Reliance'],

But as you can see in the output, infix is not applied to ind.
We want infix applied to the subsequent words as well, as shown below:
'snippet': '<mark>RELIANCE</mark> <mark>INDUSTRIES</mark> LTD'}
Kishore Nallan
10:22 AM
I will have to check on that, will get back to you.
Sidharth
10:27 AM
Sure, thanks
Kishore Nallan
11:16 AM
Sidharth

Infix search is meant for searching identifiers such as model numbers, so it actually only searches on the first word in the query. This is why the highlight is not working as expected.

Taking a step back, I think the best way for you to achieve what you want is to generate those 2-3 char combinations of tokens yourself and index them in a separate array field which you can include as part of your query_by field list.
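Note: a minimal sketch of the suggestion above, interpreting "2-3 char combinations of tokens" as the 2- and 3-character prefixes of each word. That interpretation, the helper name `token_prefixes`, and the field name `namePrefixes` are assumptions for illustration; they are not from the thread.

```python
def token_prefixes(text: str, min_len: int = 2, max_len: int = 3) -> list[str]:
    """Collect the short prefixes of every word in text, without duplicates."""
    prefixes = []
    seen = set()
    for word in text.lower().split():
        for n in range(min_len, max_len + 1):
            if len(word) >= n:
                prefix = word[:n]
                if prefix not in seen:
                    seen.add(prefix)
                    prefixes.append(prefix)
    return prefixes


# Build the extra array field before indexing the document.
doc = {"name": "RELIANCE INDUSTRIES LTD"}
doc["namePrefixes"] = token_prefixes(doc["name"])  # hypothetical field name
print(doc["namePrefixes"])
# ['re', 'rel', 'in', 'ind', 'lt', 'ltd']
```

With the prefixes indexed as whole tokens, a query such as rel ind can match both words exactly once the new field is listed in query_by (for example `name,namePrefixes`). The trade-off is a larger index and the need to regenerate the field whenever the source text changes.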
Kishore Nallan
11:17 AM
The other option is for us to add a feature to use prefix search against all the tokens in the query. Happy to discuss the specifics of that on DM.
Sidharth
11:47 AM
Just to confirm my understanding:
Currently Typesense does not support infix matching on multiple words, right?
Kishore Nallan
12:12 PM
Yes, correct. Infix search is an O(N) operation, so it's meant for a very specific case with small datasets. We don't recommend it for high-traffic or large-data use cases.
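Note: to give a feel for why an O(N) operation scales poorly, here is a naive substring scan. It only illustrates linear cost; the thread does not describe how Typesense implements infix search internally, and the function name `infix_scan` is hypothetical.

```python
def infix_scan(needle: str, values: list[str]) -> list[str]:
    """Return every value that contains needle anywhere inside it.

    Each value is inspected once, so the work grows linearly with
    the number of indexed values.
    """
    needle = needle.lower()
    return [value for value in values if needle in value.lower()]


symbols = ["RELIANCE", "INFOSYS", "ALLIANZ"]
print(infix_scan("lian", symbols))
# ['RELIANCE', 'ALLIANZ']
```

Doubling the number of values roughly doubles the work per query, which is why this kind of search suits small datasets better than large or high-traffic ones.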