Hi <#C01P749MET0|>, I was wondering if there is a ...
# community-help
j
Hi #C01P749MET0, I was wondering if there is a possibility to modify the Text Match Score calculation. In Algolia I can do this here (see image). How does it work with Typesense?
j
j
Yes, I also found this. It explains it but how can I modify the ranking criteria?
j
Using the sort_by parameter, along with these params: https://typesense.org/docs/0.22.2/api/documents.html#ranking-parameters
Could you give me an example of what you're trying to modify ranking-wise?
j
I want to show suggestions in the form of an autocomplete to the user. The suggestions are just simple documents with one attribute, `value`. I want to make sure that when I start typing e.g. "a", terms starting with "a" are shown first, no matter if there are other terms that include more "a"s but not at the beginning.
j
We don't account for word position in Typesense at the moment. Could you open a Github issue to track this, along with this example?
j
Oh, that's a surprise. I will create a Github issue.
j
One workaround in the meantime would be to index the first word in a separate field, search on that for autocomplete, but at display time show the other field which has all the words.
j
Interesting suggestion, but I don't want to apply this only to the first word but to multiple words if they exist together.
j
You could add both the first_word field and the full field in that order to query_by. Could you try this on 0.23.0.rc69?
j
So the full_term field should contain the whole term or the whole term except the first word?
j
Let's try whole term except the first word…
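A minimal sketch of the two-field workaround discussed above, for anyone following along. The helper name `toAutocompleteDocument` is hypothetical, and keeping the original `value` field for display is an assumption; only `first_word`, `full_term`, and `query_by` come from the thread.

```javascript
// Hypothetical helper: split a term into the fields discussed above.
// first_word holds the first word, full_term holds everything after it,
// and value keeps the whole term for display (an assumption).
function toAutocompleteDocument(term) {
  const words = term.trim().split(/\s+/);
  return {
    value: term,
    first_word: words[0],
    full_term: words.slice(1).join(" "),
  };
}

// Search on first_word before full_term, in that order.
const searchParameters = {
  query_by: "first_word,full_term",
};

const exampleDocument = toAutocompleteDocument(
  "Anterior part of the inferior surface of cerebrum"
);
console.log(exampleDocument, searchParameters);
```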
j
@Jason Bosco I'm trying it right out. Do I have to sort it in a specific way?
j
Not sure yet, how does it look with the default sort?
j
With the suggested solution, I now get these results, which still aren't correct.
j
Do you have a code sandbox I can play around with?
j
Unfortunately for this example I don't because I use docker on my local machine.
j
If you can maybe set up something like ngrok temporarily, that would work as well...
j
Thanks, I will have a look.
@Jason Bosco Here it is. Using "An" for example, you will see that the results are not correct. I'm also not sure why it highlights the whole word instead of just the term I typed in. https://codesandbox.io/s/typesense-autocomplete-with-first-word-attribute-h7y1p1?file=/src/index.js
j
@Johannes Köhler Could you upgrade to `0.24.0.rc1` (this should be available on Docker Hub) and let me know? We made some changes to the relevance algorithms there, which I think will help with your use case. This build also has a change to highlighting, where the individual matched characters are highlighted instead of the whole word on match.
j
@Jason Bosco I updated the docker now to this new version. Unfortunately, the order of the results is still not correct.
j
@Johannes Köhler I set `query_by: 'first_word'` and these are the results I see. Does this line up with what you're looking to do?
j
@Jason Bosco Yes, it is definitely an improvement but it would obviously fail when you continue writing a second word.
j
Ah yes, my bad 🤦‍♂️ @Johannes Köhler Ok here's another way to do this: you want to index a new field called say "search_string" and then remove all spaces when you create this field, at indexing time. So for eg, you would index "Anterior part of the inferior surface of cerebrum" as:
{
  search_string: "Anteriorpartoftheinferiorsurfaceofcerebrum",
  display_words: "Anterior part of the inferior surface of cerebrum"
}
And then set these search params:
{
  ...
  query_by: "search_string",
  highlight_full_fields: "display_words"
}
We're essentially getting Typesense to treat the whole string as one word when searching, but then at display time we show a different field.
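A rough sketch of how the indexing side of this could look. The helper names are hypothetical. Stripping whitespace from the user's query as well is an assumption (it seems necessary once a second word is typed, but it isn't stated above).

```javascript
// Hypothetical helper: remove all whitespace so the text is one token.
function toSearchString(text) {
  return text.replace(/\s+/g, "");
}

// Build the document to index: a space-free field to search on,
// plus the original text for display.
function toSearchDocument(term) {
  return {
    search_string: toSearchString(term),
    display_words: term,
  };
}

// Build the search params from the thread. Normalizing q the same way
// as search_string is an assumption, not something confirmed above.
function toSearchParameters(userInput) {
  return {
    q: toSearchString(userInput),
    query_by: "search_string",
    highlight_full_fields: "display_words",
  };
}

console.log(toSearchDocument("Anterior part of the inferior surface of cerebrum"));
console.log(toSearchParameters("Anterior pa"));
```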
j
Ok, thanks I will test it. I just wonder why you don't add the word position to the search ranking?
j
Just a matter of bandwidth (time / effort) 🙂
We definitely want to support it
j
I understand. But I can't imagine that I'm the only one who would like to have this. It feels quite essential.
j
We've had maybe 4 or 5 asks for it over the years, but it seemed like it wasn't important enough for anyone to document the ask in a Github issue (until you did recently). So I'd imagine not all use-cases require start-of-word prioritization specifically.
j
Hmm, I see. Just to clarify. It isn't only about the first word. It is more about the distance of the matching string to the beginning of the term. Anyway, you guys know better what is important for you.
j
It is more about the distance of the matching string to the beginning of the term.
Yup yup. That's the general use-case.
@Johannes Köhler Even if we had this feature, I'm wondering if that would help your specific use-case. For eg, if there were two records with title:
"Function of the brain"
"Brain function"
and the search query is "Brai", this feature would rank the results as:
1. "Brain function"
2. "Function of the brain"
since "Brain" appears earlier in the field in result #1. Key thing is that word position is a ranking signal, and doesn't exclude any results. But in your use-case it sounds like you'd want to not show #2 at all, since it doesn't start with "Brai" in the first word, right?
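To make "position as a ranking signal" concrete, here is a small client-side illustration. It is not how Typesense implements it, just a sketch with hypothetical function names: every matching record is kept, and records are ordered only by how early the matching word appears.

```javascript
// Index of the first word in the title that starts with the query,
// or -1 if no word matches (case-insensitive).
function firstMatchPosition(title, query) {
  const prefix = query.toLowerCase();
  const words = title.toLowerCase().split(/\s+/);
  return words.findIndex((word) => word.startsWith(prefix));
}

// Keep every matching title and order by match position only.
function rankByPosition(titles, query) {
  return titles
    .map((title) => ({ title, position: firstMatchPosition(title, query) }))
    .filter((entry) => entry.position !== -1)
    .sort((a, b) => a.position - b.position)
    .map((entry) => entry.title);
}

console.log(rankByPosition(["Function of the brain", "Brain function"], "Brai"));
// → [ 'Brain function', 'Function of the brain' ]
```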
j
No, I also want to show #2. For example search for "a" I would expect a b a b a b b a a b b b b a b a b a b b b b a b b b b a
Just another thought that I had today. Specifically for the suggestions feature I would expect this ranking as well a b a a b b b a b a b a Basically, the distance of the first word is important, and only when it is the same, then the amount or even the distances of the other word are taken into account.
j
I think this should already be covered in how we're thinking about this feature. I'll keep you posted.
k
I've a build available for testing position based text match. Do you have a local dev environment setup that you can test with or do you use Typesense Cloud?
j
Hi @Kishore Nallan, yes I have a local docker instance running which I made publicly available with a service.
k
@Johannes Köhler This is available in the `typesense/typesense:0.24.0.rc2` Docker build. You need to send a `prioritize_token_position=true` flag in the search query to enable this feature.
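For reference, a sketch of a search request URL with the flag set. The host, the collection name `suggestions`, and the helper name are placeholders; `value` is the field mentioned earlier in the thread.

```javascript
// Hypothetical helper: build a search URL with the position flag enabled.
// Host and collection name are placeholders for a local Docker setup.
function buildSearchUrl(baseUrl, collection, query) {
  const params = new URLSearchParams({
    q: query,
    query_by: "value",
    prioritize_token_position: "true",
  });
  return `${baseUrl}/collections/${encodeURIComponent(collection)}/documents/search?${params}`;
}

// The request itself also needs the API key in the X-TYPESENSE-API-KEY header.
const searchUrl = buildSearchUrl("http://localhost:8108", "suggestions", "ear");
console.log(searchUrl);
```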
j
Perfect, I will test it as soon as possible. Thanks!
@Kishore Nallan It already looks very good. Well done! Just found one issue. Have a look at this picture. Shouldn't "Ear" be listed as the first result? Also, "Outer ear" should be higher than "Uterus, early proliferative phase", no?
k
Yes, the changes I made only take positional information into consideration with that flag. To make "Ear" rank first, we should then also consider shorter text to be more relevant than longer text.
j
Ah, I see.
k
This is not easy to do at the moment because we don't store the length of all the fields for each document and the inverted index only contains positional information for each word in the field.
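Until something like that exists on the server side, one possible stopgap is to re-sort the returned suggestions on the client: match position first, then shorter text first. This is only a sketch with hypothetical function names, not a Typesense feature, and it only reorders the hits that were already fetched.

```javascript
// Index of the first word that starts with the query; hits without a
// matching word are pushed to the end.
function matchPosition(text, query) {
  const prefix = query.toLowerCase();
  const index = text.toLowerCase().split(/\s+/).findIndex((word) => word.startsWith(prefix));
  return index === -1 ? Number.MAX_SAFE_INTEGER : index;
}

// Sort by match position, then by text length as a tie-break.
function rerankSuggestions(hits, query) {
  return [...hits].sort((a, b) => {
    const byPosition = matchPosition(a, query) - matchPosition(b, query);
    return byPosition !== 0 ? byPosition : a.length - b.length;
  });
}

console.log(
  rerankSuggestions(["Uterus, early proliferative phase", "Outer ear", "Ear"], "ear")
);
// → [ 'Ear', 'Outer ear', 'Uterus, early proliferative phase' ]
```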