Array Field Autocomplete Issue in Typesense Migration
TLDR Kanwei encountered issues with autocomplete when migrating from Elasticsearch to Typesense. Jason and Kishore Nallan identified it as a bug and instructed Kanwei to create a GitHub issue.
Mar 17, 2023 (9 months ago)
Kanwei
02:03 PM
We have a "company" schema with company_name and an array field called subsidiary_names that holds the company's subsidiaries.
We have an autocomplete/prefix query searching on both company_name and subsidiary_names. It works, except that when searching on subsidiary_names, it seems to commingle the entries. For example, if there's a company with subsidiaries of ["Hawaii Electric", "Oahu Power"] and you search for "Hawaii Power", both entries get matched.
Also, it seems like Typesense will match as long as there's a single match. For example, if the company name is "Hawaii Electric" but you search for "hawaii electric asdfasdf", it still considers it a match. Is there any way to change this behavior? Elasticsearch doesn't do this by default.
Jason
04:35 PM
Could you expand on what you mean by “both entries” here? Do you mean it’s matching records with company_name as `Hawaii Power` and also records where the subsidiary field has `Hawaii Power` in it?
Jason
04:36 PM
This behavior is controlled by `drop_tokens_threshold`, which is set to `1` by default. If you set it to `0`, it will give you the behavior you’re describing. It's documented in the table here: https://typesense.org/docs/0.24.0/api/search.html#typo-tolerance-parameters
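As a rough illustration of Jason's suggestion, the sketch below builds a search request URL with `drop_tokens_threshold` set to `0`. The host `http://localhost:8108`, the collection name `companies`, and the helper `build_search_url` are assumptions made for this example and do not come from the thread; the field names are the ones Kanwei mentions.

```python
from urllib.parse import urlencode


def build_search_url(base_url, collection, query):
    """Build a Typesense search URL with token dropping disabled.

    `base_url` and `collection` are hypothetical values supplied by the caller.
    """
    params = {
        "q": query,
        # Search both the name field and the array field from the thread.
        "query_by": "company_name,subsidiary_names",
        # Treat the last query word as a prefix, as in an autocomplete box.
        "prefix": "true",
        # 0 disables dropping query tokens when too few results are found.
        "drop_tokens_threshold": 0,
    }
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


url = build_search_url("http://localhost:8108", "companies", "hawaii electric asdfasdf")
print(url)
```

The request itself would still need the `X-TYPESENSE-API-KEY` header. With token dropping disabled, a query such as "hawaii electric asdfasdf" should no longer fall back to matching on "hawaii electric" alone.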
Mar 20, 2023 (9 months ago)
Jason
04:29 PM
Want to make sure I understand the issue fully
Mar 21, 2023 (9 months ago)
Kishore Nallan
11:15 AM
`foo bar` could end up being stored in 2 different elements of a single field, and currently we are unable to account for this case. This is a bug and we need to fix it. Can you please create a GitHub issue here: https://github.com/typesense/typesense/issues
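To make the case Kishore describes concrete, here is a toy model in Python. It is not Typesense's implementation, only an assumed simplification: one matcher pools the tokens of every array element together, the other requires a single element to contain all query tokens. The function names are hypothetical.

```python
def tokens(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def matches_pooled(query, elements):
    """Match if every query token appears anywhere in the array field."""
    pool = {t for element in elements for t in tokens(element)}
    return all(q in pool for q in tokens(query))


def matches_per_element(query, elements):
    """Match only if one single element contains every query token."""
    return any(
        all(q in tokens(element) for q in tokens(query))
        for element in elements
    )


subsidiaries = ["Hawaii Electric", "Oahu Power"]
print(matches_pooled("Hawaii Power", subsidiaries))       # True
print(matches_per_element("Hawaii Power", subsidiaries))  # False
print(matches_per_element("Oahu Power", subsidiaries))    # True
```

The pooled matcher reproduces the behavior Kanwei reported, where "Hawaii Power" matches even though no single subsidiary contains both words.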