Hi < C01P749MET0|> I was wondering if there is a possibility typesense #community-help


Johannes Köhler

05/24/2022, 11:45 AM

Hi #C01P749MET0, I was wondering if there is a possibility to modify the Text Match Score calculation. In Algolia I can do this here (see image). How does it work with TypeSense?

Jason Bosco

05/24/2022, 11:46 AM

Here's the equivalent in Typesense: https://typesense.org/docs/guide/ranking-and-relevance.html

Johannes Köhler

05/24/2022, 11:47 AM

Yes, I also found this. It explains it but how can I modify the ranking criteria?

Jason Bosco

05/24/2022, 11:51 AM

Using the sort_by parameter, along with these params: https://typesense.org/docs/0.22.2/api/documents.html#ranking-parameters

Jason Bosco

05/24/2022, 11:51 AM

Could you give me an example of what you're trying to modify ranking-wise?

Johannes Köhler

05/24/2022, 11:59 AM

I want to show suggestions in the form of an autocomplete to the user. The suggestions are just simple documents with one attribute, value. I want to make sure that when I start typing e.g. "a", only terms starting with "a" are shown first, no matter if there are other terms that include more "a"s but not at the beginning.

Jason Bosco

05/24/2022, 12:03 PM

We don't account for word position in Typesense at the moment. Could you open a Github issue to track this, along with this example?

Johannes Köhler

05/24/2022, 12:04 PM

Oh, that's a surprise. I will create a Github issue.

Jason Bosco

05/24/2022, 12:27 PM

One workaround in the meantime would be to index the first word in a separate field, search on that for autocomplete, but at display time show the other field which has all the words.

Johannes Köhler

05/24/2022, 12:41 PM

Interesting suggestion, but I don't want to apply this only to the first word but to multiple words if they exist together.

Johannes Köhler

05/24/2022, 12:42 PM

I created a Github issue https://github.com/typesense/typesense/issues/601

Jason Bosco

05/24/2022, 1:03 PM

You could add both the first_word field and the full field in that order to query_by. Could you try this on 0.23.0.rc69?

Johannes Köhler

05/24/2022, 1:18 PM

So the full_term field should contain the whole term or the whole term except the first word?

Jason Bosco

05/24/2022, 1:37 PM

Let's try whole term except the first word…
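A minimal sketch of the indexing step for this workaround, assuming the field names first_word and full_term from the messages above. The helper splitTerm and the value field used for display are hypothetical, not part of Typesense:

```javascript
// Hypothetical helper: split one suggestion term into the two fields
// discussed above. Only the field names come from the thread.
function splitTerm(value) {
  const words = value.trim().split(/\s+/);
  return {
    value,                               // original term, shown to the user
    first_word: words[0],                // searched first
    full_term: words.slice(1).join(" ")  // whole term except the first word
  };
}

// Search parameters listing both fields in priority order.
const searchParams = {
  q: "An",
  query_by: "first_word,full_term"
};
```

Each document would be passed through splitTerm before indexing, and the display layer would render value rather than either search field.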

Johannes Köhler

05/25/2022, 9:28 AM

@Jason Bosco I'm trying it out right now. Do I have to sort it in a specific way?

Jason Bosco

05/25/2022, 12:58 PM

Not sure yet, how does it look with the default sort?

Johannes Köhler

05/25/2022, 3:27 PM

With the suggested solution I currently get these results, which still aren't correct.

Jason Bosco

05/25/2022, 3:59 PM

Do you have a code sandbox I can play around with?

Johannes Köhler

05/25/2022, 4:00 PM

Unfortunately for this example I don't because I use docker on my local machine.

Jason Bosco

05/25/2022, 4:03 PM

If you can maybe set up something like ngrok temporarily, that would work as well...

Johannes Köhler

05/25/2022, 4:04 PM

Thanks, I will have a look.

Johannes Köhler

05/26/2022, 9:42 AM

@Jason Bosco Here it is. Using "An" for example, you will see that the results are not correct. I'm also not sure why it highlights the whole word instead of just the term I typed in. https://codesandbox.io/s/typesense-autocomplete-with-first-word-attribute-h7y1p1?file=/src/index.js

Jason Bosco

05/26/2022, 6:57 PM

@Johannes Köhler Could you upgrade to 0.24.0.rc1 (this should be available on Docker Hub) and let me know? We made some changes to relevance algorithms there, which I think will help with your use case. This build also has a change to highlighting, where single characters are highlighted, instead of the whole word on match.

Johannes Köhler

05/30/2022, 11:54 AM

@Jason Bosco I updated the docker now to this new version. Unfortunately, the order of the results is still not correct.

Jason Bosco

05/30/2022, 7:10 PM

@Johannes Köhler I set query_by: 'first_word' and these are the results I see. Does this line up with what you're looking to do?

Johannes Köhler

05/31/2022, 9:04 AM

@Jason Bosco Yes, it is definitely an improvement but it would obviously fail when you continue writing a second word.

Jason Bosco

05/31/2022, 3:07 PM

Ah yes, my bad 🤦‍♂️ @Johannes Köhler Ok here's another way to do this: you want to index a new field called say "search_string" and then remove all spaces when you create this field, at indexing time. So for eg, you would index "Anterior part of the inferior surface of cerebrum" as:

{
  search_string: "Anteriorpartoftheinferiorsurfaceofcerebrum",
  display_words: "Anterior part of the inferior surface of cerebrum"
}

And then set these search params:

{
  ...
  query_by: "search_string",
  highlight_full_fields: "display_words"
}

Jason Bosco

05/31/2022, 3:08 PM

We're essentially getting Typesense to treat the whole string as one word when searching, but then at display time we show a different field.
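A rough sketch of how the space-stripped field could be produced at indexing time. The field names search_string and display_words come from the message above; the helpers toSearchDocument and toSearchQuery are hypothetical. Stripping spaces from the typed query as well is an assumption, not something stated in the thread:

```javascript
// Hypothetical helper: build the document shape shown above by removing
// all whitespace from the display text.
function toSearchDocument(displayWords) {
  return {
    search_string: displayWords.replace(/\s+/g, ""),
    display_words: displayWords
  };
}

// Assumption: the typed query is normalized the same way, so that a
// multi-word query is still a prefix of search_string.
function toSearchQuery(typed) {
  return typed.replace(/\s+/g, "");
}
```

With this, typing "Anterior pa" would be sent as "Anteriorpa", which is a prefix of the indexed search_string.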

Johannes Köhler

05/31/2022, 3:13 PM

Ok, thanks I will test it. I just wonder why you don't add the word position to the search ranking?

Jason Bosco

05/31/2022, 3:13 PM

Just a matter of bandwidth (time / effort) 🙂

Jason Bosco

05/31/2022, 3:14 PM

We definitely want to support it

Johannes Köhler

05/31/2022, 3:17 PM

I understand. But I can't imagine that I'm the only one who would like to have this. It feels quite essential.

Jason Bosco

05/31/2022, 3:20 PM

We've had maybe 4 or 5 asks for it over the years, but it seemed like it wasn't important enough for anyone to document the ask in a Github issue (until you did recently). So I'd imagine not all use-cases require start of words prioritization specifically.

Johannes Köhler

05/31/2022, 3:25 PM

Hmm, I see. Just to clarify. It isn't only about the first word. It is more about the distance of the matching string to the beginning of the term. Anyway, you guys know better what is important for you.

Jason Bosco

05/31/2022, 3:26 PM

It is more about the distance of the matching string to the beginning of the term.

Yup yup. That's the general use-case.

Jason Bosco

05/31/2022, 4:27 PM

@Johannes Köhler Even if we had this feature, I'm wondering if that would help your specific use-case. For eg, if there were two records with title:
"Function of the brain"
"Brain function"
and the search query is "Brai", this feature would rank the results as:
1. "Brain function"
2. "Function of the brain"
Since "Brain" appears earlier in the field in result #1. Key thing is that word position is a ranking signal, and doesn't exclude any results. But in your use-case it sounds like you'd want to not show #2 at all, since it doesn't start with "Brai" in the first word, right?
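To illustrate "ranking signal, not a filter", here is a small client-side sketch. It is not how Typesense implements the feature. The functions firstMatchPosition and rankByPosition are hypothetical, and matching is simplified to a case-insensitive prefix check on whitespace-separated tokens:

```javascript
// Index of the first token that starts with the query, or a large
// sentinel when no token matches (so non-matches sort last).
function firstMatchPosition(title, query) {
  const q = query.toLowerCase();
  const tokens = title.toLowerCase().split(/\s+/);
  const index = tokens.findIndex((token) => token.startsWith(q));
  return index === -1 ? Number.MAX_SAFE_INTEGER : index;
}

// Orders titles by match position; nothing is removed from the list.
function rankByPosition(titles, query) {
  return [...titles].sort(
    (a, b) => firstMatchPosition(a, query) - firstMatchPosition(b, query)
  );
}
```

For the two titles above and the query "Brai", both records stay in the result list and "Brain function" is placed first.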

Johannes Köhler

06/01/2022, 7:41 AM

No, I also want to show #2. For example search for "a" I would expect a b a b a b b a a b b b b a b a b a b b b b a b b b b a

👍 1

Johannes Köhler

06/01/2022, 2:47 PM

Just another thought that I had today. Specifically for the suggestions feature I would expect this ranking as well a b a a b b b a b a b a Basically, the distance of the first word is important, and only when it is the same, then the amount or even the distances of the other word are taken into account.

Jason Bosco

06/01/2022, 5:37 PM

I think this should already be covered in how we're thinking about this feature. I'll keep you posted.

👍 1

Kishore Nallan

06/07/2022, 9:41 AM

I've a build available for testing position based text match. Do you have a local dev environment setup that you can test with or do you use Typesense Cloud?

Johannes Köhler

06/07/2022, 1:26 PM

Hi @Kishore Nallan, yes I have a local docker instance running which I made publicly available with a service.

Kishore Nallan

06/08/2022, 1:02 PM

@Johannes Köhler This is available in the typesense/typesense:0.24.0.rc2 Docker build. You need to send a prioritize_token_position=true flag to the search query to enable this feature.
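A sketch of a search request with the flag enabled. The prioritize_token_position parameter is from the message above. The collection name suggestions, the value field, the local host and port, and the helper buildSearchUrl are assumptions for illustration:

```javascript
// Hypothetical helper: build a search URL for a Typesense collection.
function buildSearchUrl(baseUrl, collection, params) {
  const query = new URLSearchParams(
    Object.entries(params).map(([key, value]) => [key, String(value)])
  );
  return `${baseUrl}/collections/${encodeURIComponent(collection)}/documents/search?${query}`;
}

const url = buildSearchUrl("http://localhost:8108", "suggestions", {
  q: "ear",
  query_by: "value",
  prioritize_token_position: true // enables position-based text match
});
```

The same key can be added to the search parameters object of whichever client library is in use.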

Johannes Köhler

06/08/2022, 1:02 PM

Perfect, I will test it as soon as possible. Thanks!

👍 1

Johannes Köhler

06/09/2022, 8:39 AM

@Kishore Nallan It already looks very good. Well done! Just found one issue. Have a look at this picture. Shouldn't "Ear" be listed as the first result? Also, "Outer ear" should be higher than "Uterus, early proliferative phase", no?

Johannes Köhler

06/09/2022, 8:41 AM

Here you can find the updated version of the example app https://codesandbox.io/s/typesense-autocomplete-with-first-word-attribute-h7y1p1?file=/src/index.js

Kishore Nallan

06/09/2022, 9:17 AM

Yes, the changes I made only take into consideration the positional information with that flag. To make "ear" rank first, we should also then consider shorter text to be more relevant than longer text.

Johannes Köhler

06/09/2022, 9:18 AM

Ah, I see.

Kishore Nallan

06/09/2022, 9:39 AM

This is not easy to do at the moment because we don't store the length of all the fields for each document and the inverted index only contains positional information for each word in the field.
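Until the engine supports it, one possible stopgap is to re-rank the returned suggestions on the client: match position first, then shorter text. This is only a sketch under that assumption, not something suggested in the thread. The function names and the "Ear canal" entry are hypothetical:

```javascript
// Index of the first token that starts with the query (large sentinel
// if none does), using a simplified case-insensitive prefix match.
function matchPosition(text, query) {
  const q = query.toLowerCase();
  const index = text.toLowerCase().split(/\s+/).findIndex((t) => t.startsWith(q));
  return index === -1 ? Number.MAX_SAFE_INTEGER : index;
}

// Sort by match position, then by text length as a tie-breaker.
function rerankSuggestions(texts, query) {
  return [...texts].sort(
    (a, b) =>
      matchPosition(a, query) - matchPosition(b, query) ||
      a.length - b.length
  );
}
```

This only reorders the hits already returned, so it helps just when the relevant documents are within the fetched page.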
