#community-help

Discussing Improved Multifield Search Options and Text Score Feature

TLDR bnfd asked Kishore Nallan about the improvements to multi-field searches and the text match score mentioned in issue #516. Kishore Nallan explained that the typo and drop token thresholds are now applied globally rather than per field. They also discussed bucketing of text match scores.

Feb 07, 2022 (23 months ago)
bnfd
02:39 PM
Kishore Nallan, in issue #516 I saw you mentioned an improvement for multi-field searches. Could you please explain a bit what that entails?
Kishore Nallan
02:42 PM
Sure. Earlier, typo_tokens_threshold and drop_tokens_threshold were not always applied at the global level, but only at a per-field level. For example, for the query foo, if there was a name field with that value but no description field containing that word, Typesense used to query for typo variations of the word in the description index. This often confused people. Now we consider the typo and drop tokens thresholds at the global level: if at least one field of a record matches the query word, typo/drop token variations are not looked for in the other fields.
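To make the difference concrete, here is a toy model of the two behaviours. It is only a sketch of the idea described above, not Typesense's actual code: the function names, the `exact_hits` structure, and the "fewer matches than the threshold" rule are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def fields_needing_typos_per_field(
    exact_hits: Dict[str, Set[str]], threshold: int
) -> List[str]:
    """Old behaviour (as described above): every field is judged on its own.

    A field with fewer exact matches than the threshold triggers a
    typo-variation lookup in that field's index.
    """
    return [field for field, ids in exact_hits.items() if len(ids) < threshold]


def needs_typos_globally(exact_hits: Dict[str, Set[str]], threshold: int) -> bool:
    """New behaviour (as described above): matches are counted across all fields.

    Typo variations are only looked for when the records matched by any
    field, taken together, are fewer than the threshold.
    """
    matched_records = set().union(*exact_hits.values())
    return len(matched_records) < threshold


# Query "foo": one record matches in `name`, nothing matches in `description`.
exact_hits = {"name": {"doc1"}, "description": set()}

print(fields_needing_typos_per_field(exact_hits, threshold=1))  # ['description']
print(needs_typos_globally(exact_hits, threshold=1))  # False
```

In the per-field model the empty description field still triggers a typo lookup, which is the confusing case from the foo example. In the global model the single name match is enough to skip it.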

Kishore Nallan
02:45 PM
The other improvement, though not related to multi-field searching, is the ability to bucket text match scores so that the popularity score has a chance to be effective in the sorting. Usage:

_text_match(buckets: 10):desc 

The buckets parameter indicates the number of intervals that the text match scores should be divided into, such that the documents falling within the same bucket are deemed to have the same text match score. When there are fewer documents than the number of buckets, all documents will belong to the same bucket. Only the first 250 documents and their scores are bucketed this way. A value of 0 or 1 disables bucketing.
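A rough sketch of why bucketing lets popularity matter is shown below. This is an illustration only, not Typesense's implementation: it assumes documents are split into equal-sized groups by text match rank, the `popularity` field and the function name are hypothetical, and the 250-document limit is left out for brevity.

```python
import math
from typing import Dict, List, Tuple


def sort_with_buckets(docs: List[Dict], buckets: int) -> List[Dict]:
    """Sort by bucketed text match (desc), then by popularity (desc)."""
    # Rank documents by their raw text match score, best first.
    ranked = sorted(docs, key=lambda d: d["text_match"], reverse=True)

    keyed: List[Tuple[int, Dict]]
    if buckets <= 1:
        # A value of 0 or 1 disables bucketing: keep the raw scores.
        keyed = [(d["text_match"], d) for d in ranked]
    elif len(ranked) < buckets:
        # Fewer documents than buckets: everything shares one bucket.
        keyed = [(0, d) for d in ranked]
    else:
        # Assumption: equal-sized groups by rank; a higher key is a better bucket.
        size = math.ceil(len(ranked) / buckets)
        keyed = [(-(i // size), d) for i, d in enumerate(ranked)]

    # Documents in the same bucket tie on text match, so popularity decides.
    keyed.sort(key=lambda kd: (kd[0], kd[1]["popularity"]), reverse=True)
    return [d for _, d in keyed]


docs = [
    {"id": "a", "text_match": 100, "popularity": 1},
    {"id": "b", "text_match": 99, "popularity": 50},
    {"id": "c", "text_match": 50, "popularity": 5},
    {"id": "d", "text_match": 49, "popularity": 9},
]

print([d["id"] for d in sort_with_buckets(docs, buckets=2)])  # ['b', 'a', 'd', 'c']
print([d["id"] for d in sort_with_buckets(docs, buckets=1)])  # ['a', 'b', 'c', 'd']
```

Without bucketing, b never outranks a despite being far more popular, because the raw scores 100 and 99 differ. In a real search request the equivalent would be a sort_by value along the lines of _text_match(buckets: 10):desc,popularity:desc, where popularity stands in for whatever numeric field the collection defines.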
bnfd
02:46 PM
Can we still apply typo/drop_token settings for each field separately?
Kishore Nallan
02:47 PM
These have never been per-field parameters, even though they were being applied at a per-field level, since all our indices are field-specific.
bnfd
02:49 PM
Is the text score a new feature? Is it related to https://github.com/typesense/typesense/issues/439?
Kishore Nallan
02:56 PM
That issue is about breaking the text_match value that we return today into components that make it easier to relate to.
