#community-help

Resolve Facets and Sorting Issues with Typesense

TLDR Ethan needed assistance with getting all facet values and sorting results by date using Typesense. Jason provided guidance on how to use Typesense properties to accomplish these tasks, and resolved issues related to specific use-cases provided by Ethan and Rushil.

Jul 25, 2023
Ethan
11:07 PM
Hey everyone - is there a way to just get all the facet values for a particular facet?
Jul 26, 2023
Jason
12:00 AM
You want to do facet_by: fieldName and max_facet_values: 9999999 (some large number)
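Editor's note: Jason's suggestion could look roughly like this with typesense-js. It is a minimal sketch, not a definitive implementation. The collection name `jobs`, the facet field `category`, and the `query_by` field `title` are assumptions made for illustration, and the client call itself appears only as a comment.

```typescript
// Search parameters for listing every value of one facet.
// `facet_by` and `max_facet_values` are the parameters Jason names;
// the field names passed in below are hypothetical.
interface AllFacetValuesParams {
  q: string;
  query_by: string;
  facet_by: string;
  max_facet_values: number;
}

function buildAllFacetValuesParams(
  facetField: string,
  queryByField: string,
  maxFacetValues = 9999999 // "some large number", as in the thread
): AllFacetValuesParams {
  return {
    q: "*", // wildcard query, so no document is excluded by the search text
    query_by: queryByField,
    facet_by: facetField,
    max_facet_values: maxFacetValues,
  };
}

const categoryFacetParams = buildAllFacetValuesParams("category", "title");

// With a typesense-js client this object would be passed to a search call, e.g.:
// const res = await client.collections("jobs").documents().search(categoryFacetParams);
// The facet values are then listed under res.facet_counts.
```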
Ethan
12:32 AM
is there a way to do this with the searchClient from TypesenseInstantSearchAdapter? I noticed it doesn't have the search method
Jason
12:46 AM
If you’re using Instantsearch.js, you want to use the limit and showMoreLimit parameters in RefinementList or any of the other filter widgets, which the adapter then uses internally to set facet_by and max_facet_values
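Editor's note: as a rough sketch of the widget options Jason mentions, the object below holds RefinementList-style settings. The attribute name `category` and the numeric limits are made-up values, and the local interface only mirrors the handful of options discussed here.

```typescript
// Options for a RefinementList-style widget. `attribute`, `limit`, `showMore`,
// and `showMoreLimit` are InstantSearch options; the values are illustrative.
interface RefinementListOptions {
  attribute: string;
  limit: number;
  showMore: boolean;
  showMoreLimit: number;
}

const categoryRefinementOptions: RefinementListOptions = {
  attribute: "category", // hypothetical facet field
  limit: 50, // number of values shown initially
  showMore: true, // enables the "show more" toggle
  showMoreLimit: 500, // number of values shown once expanded
};

// In react-instantsearch these would be passed as props:
// <RefinementList {...categoryRefinementOptions} />
```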
Ethan
12:47 AM
we’re using react-instantsearch
Ethan
12:47 AM
but we want the facet RefinementList to stay fixed with all options, rather than having options disappear as others are selected
Jason
12:48 AM
> https://typesense-community.slack.com/archives/C01P749MET0/p1690332444060009?thread_ts=1690326451.922009&cid=C01P749MET0
Could you elaborate on this? This is how I’ve seen refinementList work by default
Ethan
12:48 AM
yes but the default behavior is that facets with no hits won't be shown - we want to show all facets regardless
Jason
12:49 AM
Ah hmm
Ethan
12:49 AM
with algolia we had to fetch the facets separately and pass those in with transformItems
Ethan
12:53 AM
for example: we have a category facet with a marketing option; if I first filter by internships, this facet option won't appear in our refinement list
Jason
12:54 AM
I see, you would have to do something similar with Typesense as well
Jason
12:56 AM
The typesense-js search client can be accessed using adapter.typesenseClient
Jason
12:57 AM
which is an instance of the typesense-js client
Jason
12:57 AM
You should then be able to do a standard typesense search like this: https://typesense.org/docs/0.24.1/api/search.html
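Editor's note: the approach discussed above (fetch all facet values separately, then feed them in through `transformItems`) might be sketched as follows. The item shape is a simplified assumption about what a RefinementList passes to `transformItems`, the helper names are hypothetical, and the unfiltered search through `adapter.typesenseClient` is shown only as a comment.

```typescript
// One facet entry in a Typesense search response (facet_counts[].counts[]).
interface FacetCount { value: string; count: number; }
interface FacetResult { field_name: string; counts: FacetCount[]; }

// Simplified shape of a refinement list item (assumption for this sketch).
interface RefinementItem { value: string; label: string; count: number; isRefined: boolean; }

// Pull every value for one facet field out of an unfiltered search response.
function allValuesForFacet(facetCounts: FacetResult[], field: string): string[] {
  const facet = facetCounts.find((f) => f.field_name === field);
  return facet ? facet.counts.map((c) => c.value) : [];
}

// Keep the items the widget already has; append missing values with a count of 0.
function withAllFacetValues(items: RefinementItem[], allValues: string[]): RefinementItem[] {
  const known = new Set(items.map((i) => i.value));
  const missing = allValues
    .filter((v) => !known.has(v))
    .map((v) => ({ value: v, label: v, count: 0, isRefined: false }));
  return [...items, ...missing];
}

// Stand-in for the facet_counts of an unfiltered search, e.g. one made with
// adapter.typesenseClient.collections("jobs").documents().search({...}).
const unfilteredFacetCounts: FacetResult[] = [
  { field_name: "category", counts: [{ value: "Engineering", count: 40 }, { value: "Marketing", count: 12 }] },
];

// Stand-in for what the widget shows after filtering by internships.
const filteredItems: RefinementItem[] = [
  { value: "Engineering", label: "Engineering", count: 7, isRefined: false },
];

const mergedCategoryItems = withAllFacetValues(
  filteredItems,
  allValuesForFacet(unfilteredFacetCounts, "category")
);
```

In react-instantsearch, a function like `withAllFacetValues` would be called from the `transformItems` prop of the RefinementList, with the full value list fetched once and kept in component state.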
Ethan
12:58 AM
by typesenseClient do you mean SearchClient?
Ethan
01:00 AM
ah ok yes, it's just not in the type declaration
Jason
01:02 AM
Ah, will add it

Jason
01:46 AM
Added this in v2.7.1-3 of the adapter
Ethan
02:35 AM
amazing thanks!
Ethan
02:35 AM
another question: we want to be able to toggle sorting the results by date. With algolia we would conditionally change the index to a replica index sorted by date - is it similar with typesense?
Jason
03:09 AM
In Typesense, a single collection (what Algolia calls an index) can be sorted on any number of fields without replica indices.
Jason
03:09 AM
You want to configure the sort by widget this way: https://github.com/typesense/typesense-instantsearch-adapter#sortby

Specifically the items array, in the format described in that section
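Editor's note: a small sketch of what such an `items` array could look like. The `<collection>/sort/<sort_by>` value format is the one confirmed later in this thread; the collection name `jobs`, the labels, and the helper `sortByValue` are assumptions for illustration.

```typescript
interface SortByItem { label: string; value: string; }

// Build a SortBy value in the "<collection>/sort/<sort_by>" format.
function sortByValue(collection: string, sortBy: string): string {
  return `${collection}/sort/${sortBy}`;
}

const jobSortItems: SortByItem[] = [
  { label: "Default", value: "jobs" }, // bare collection name: no explicit sort_by
  { label: "Newest first", value: sortByValue("jobs", "updated_date:desc") },
  { label: "Oldest first", value: sortByValue("jobs", "updated_date:asc") },
];

// In react-instantsearch this array would be passed as <SortBy items={jobSortItems} />.
```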
Ethan
06:38 AM
using the SortBy component and it seems to send the correct request with sort_by, but it doesn't seem to work - no change between default and sorted
Ethan
06:52 AM
is _text_match:desc default for every query? or should it be specified, and somehow our default sort is already updated_date:desc?
Jason
04:33 PM
To be sure, are you saying that the query sent to Typesense has the correct sort_by parameter when the sortBy widget value is changed in the UI, but the results returned are not sorted by the sort_by field?

If so, could you copy-as-curl the network request to Typesense for both sort orders and DM it to me?
Jason
04:34 PM
> is _text_match:desc default for every query?
If no sort_by is mentioned, yes this is default.

If a sort_by is mentioned, _text_match:desc is added as the last sort parameter, after the sorts you’ve specified in sort_by
Jason
07:52 PM
It looks like you have "default_sorting_field": "updated_date" in the collection schema. So when you don’t specify a sort_by parameter in the search, Typesense automatically sorts by updated_date:desc. This is why the results look similar when you add sort_by=updated_date:desc
Ethan
07:53 PM
ah yes makes sense. if we set sort_by to _text_match:desc, will that override it?
Jason
07:54 PM
Yes, but only if q is something other than *, which is when text_match is calculated. For a wildcard * query, text_match is identical for every document

Ethan
07:57 PM
Awesome, makes sense. One more thing - we are trying to get results similar to what we had with Algolia, but the default Typesense setup (we only copied over the query_by) seems to give lower quality results. First half of the video (localhost) is Typesense, second half (simplify) is Algolia. Any tips/advice on improving results? (The main issue here is the 30 results from the same company in a row.)
Jason
07:58 PM
Could you share a screenshot of your Algolia ranking settings and also the section called “grouping and distinct” in your index configuration?
Ethan
08:00 PM
[image attachment]
Ethan
08:01 PM
[image attachment]
Jason
08:03 PM
Hmm, just to be sure, are the datasets identical between what you have in your Algolia index vs Typesense collection (including the updated_date timestamps)?
Ethan
08:04 PM
yeah they should be pretty close currently - they were synced yesterday. As more time passes algolia will be outdated since we stopped updating
Jason
08:09 PM
Are you sending that query to your Typesense Cloud cluster? Because when I search for “software” against the dataset in your cluster I see a different set of search results (you should see the same in your Typesense Cloud web console)
Jason
08:10 PM
In any case, what’s happening here is that the word “software” is present in all the titles, which results in a tie in text match score. To break the tie, Typesense uses the default_sorting_field, which in your case is updated_date. If that also ties, Typesense uses the document insertion timestamp to break the tie, and finally the document ID
Jason
08:11 PM
So if many job postings were updated / created at around the same time from the same company with the title “software” then they will show up next to each other
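Editor's note: the clustering effect can be reproduced with a toy comparator. This is only a simplified model of the ordering described above (text match score first, then `updated_date`, both descending), not Typesense's actual ranking code. The sample documents, scores, and timestamps are invented.

```typescript
interface JobDoc { id: string; company: string; textMatch: number; updatedDate: number; }

// Toy comparator: text match score first, then the default sorting field
// (updated_date here), both descending. The insertion-order and document ID
// tie-breaks mentioned in the thread are left out for brevity.
function compareJobs(a: JobDoc, b: JobDoc): number {
  if (a.textMatch !== b.textMatch) return b.textMatch - a.textMatch;
  return b.updatedDate - a.updatedDate;
}

// Every title contains the search term, so all text match scores tie.
// The three Acme postings were scraped within seconds of each other.
const sampleJobs: JobDoc[] = [
  { id: "1", company: "Acme", textMatch: 100, updatedDate: 1690300000 },
  { id: "2", company: "Globex", textMatch: 100, updatedDate: 1690200000 },
  { id: "3", company: "Acme", textMatch: 100, updatedDate: 1690300001 },
  { id: "4", company: "Acme", textMatch: 100, updatedDate: 1690300002 },
];

const rankedCompanies = [...sampleJobs].sort(compareJobs).map((j) => j.company);
// rankedCompanies is ["Acme", "Acme", "Acme", "Globex"]
```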
Ethan
08:11 PM
I see, so documents that score equally on text match will then be ordered by updated_date, which will likely put all jobs from the same company together since they were scraped together
Jason
08:11 PM
Right
Jason
08:12 PM
If your ideal ranking behavior is to only show one job posting from a single company for a given search term, you want to use group_by: company_name and set group_limit: 1
Ethan
08:12 PM
perhaps using a different default_sorting_field would be better
Ethan
08:13 PM
it's not that we necessarily want to limit each company to 1, but 30 in a row is too much 😆
Jason
08:13 PM
Haha! If there’s an upper limit, say 5 or 10, then group_limit and group_by would be the best way to avoid this issue completely
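Editor's note: a sketch of search parameters using `group_by` and `group_limit` as Jason suggests. The `query_by` field `title` and the helper name are assumptions. The field used in `group_by` generally needs to be declared as a facet in the collection schema.

```typescript
interface GroupedSearchParams {
  q: string;
  query_by: string;
  group_by: string;
  group_limit: number;
}

// Cap how many postings from one company appear for a given search term.
function buildGroupedSearch(query: string, maxPerCompany: number): GroupedSearchParams {
  return {
    q: query,
    query_by: "title", // hypothetical searchable field
    group_by: "company_name", // field name taken from the thread
    group_limit: maxPerCompany,
  };
}

const softwareGroupedSearch = buildGroupedSearch("software", 5);

// With typesense-js this would be passed to documents().search(...).
// Grouped results come back under `grouped_hits` rather than `hits`.
```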

Ethan
08:15 PM
prioritize_token_position may also be good here - have you seen many people use it to improve results?
Jason
08:17 PM
Yeah, I was thinking of that, but for your particular domain, I guess if someone calls a position “Software Engineer” and another position “Senior Software Engineer”, and the search term is software, token position doesn’t necessarily mean one job post is more relevant than the other
Ethan
08:20 PM
yes in some queries it would make sense but others may not
Ethan
08:21 PM
search is hard! haha thanks for the prompt responses with everything!

Ethan
08:34 PM
Is default sort order required?
Jason
08:46 PM
No it’s optional
Rushil
09:21 PM
Is there any way to update the default_sorting_field?
Rushil
09:32 PM
I think we figured out why Algolia has the different results. The default sorting mechanism is objectId. It seems like this isn’t possible on Typesense though
Jason
09:35 PM
> Is there any way to update the default_sorting_field?
Unfortunately no. You would have to create a new collection and reindex your data in it.

Jason
09:36 PM
> The default sorting mechanism is objectId. It seems like this isn’t possible on Typesense though
Do you let Algolia auto-generate these objectIds or do you generate them on your side?
Rushil
09:36 PM
Figured this was the case. We’ve set up aliases now, so upgrading the schemas is easier
Rushil
09:37 PM
> Do you let Algolia auto-generate these objectIds or do you generate them on your side?
We generate the object IDs and sync them to Algolia. We’re doing the same thing with Typesense with id
Jason
09:38 PM
Ok so the equivalent in Typesense is just called id which is used to dedupe records. However, you can’t use the id field for sorting. So you would have to duplicate the same id in a new field called say sortId and then use that field for sorting

Rushil
09:50 PM
We just set this up and now have a posting_id field. I think I may have misread something, but can you sort string fields? I’m running into errors when trying
Jason
09:51 PM
Sorting is disabled on string fields by default (since it requires additional memory). So you would have to set sort: true in the schema for that field definition

Jason
09:51 PM
Also, string fields can only be used for sorting in sort_by, not in default_sorting_field.

But you can mimic the default_sorting_field behavior, by just using sort_by explicitly in the search query
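Editor's note: a sketch of a schema that opts a string field into sorting, plus a search that sorts on it explicitly. The collection name `jobs` and the fields other than `posting_id` and `updated_date` are assumptions, and the local interfaces only cover the schema properties used here.

```typescript
interface FieldDef { name: string; type: string; sort?: boolean; }
interface CollectionSchema { name: string; fields: FieldDef[]; default_sorting_field?: string; }

const jobsSchema: CollectionSchema = {
  name: "jobs",
  fields: [
    { name: "title", type: "string" },
    { name: "updated_date", type: "int64" },
    // Sorting on string fields is off by default, so opt in here.
    { name: "posting_id", type: "string", sort: true },
  ],
  // No default_sorting_field: per the thread, a string field cannot be used there.
};

// Mimic a default sort by always sending sort_by explicitly.
const sortedByPostingId = { q: "*", query_by: "title", sort_by: "posting_id:desc" };
```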

Ethan
09:52 PM
Can the sort_by take multiple parameters? So for example if we want the first sort to be updated_date and the second/tiebreaker to be posting_id
Jason
09:52 PM
Yup
Rushil
09:52 PM
Yeah it can
Jason
09:53 PM
You can use up to 3 tie breakers in sort_by
Jason
09:54 PM
If you need more, then you want to create a single field using any formula that weights different criteria (based on your business logic) and come up with a single weight score for each record, and then sort by that
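Editor's note: one way such a combined score might be computed before indexing. The signals (`isFeatured`, `applicantCount`) and the weights are entirely hypothetical. The point is only that several criteria collapse into one number that can be stored in a numeric field and used in `sort_by`.

```typescript
interface JobSignals { updatedDate: number; isFeatured: boolean; applicantCount: number; }

// Combine several ranking criteria into a single score between 0 and 1.
// The weights (0.6 / 0.3 / 0.1) are made up; real ones depend on business logic.
function rankingScore(job: JobSignals, nowSeconds: number): number {
  const ageDays = Math.max(0, (nowSeconds - job.updatedDate) / 86400);
  const freshness = 1 / (1 + ageDays); // 1 for brand new, decays with age
  const featured = job.isFeatured ? 1 : 0;
  const popularity = Math.min(job.applicantCount, 100) / 100; // capped at 100 applicants
  return 0.6 * freshness + 0.3 * featured + 0.1 * popularity;
}

const scoreNow = 1690400000; // arbitrary reference time in Unix seconds
const freshFeaturedScore = rankingScore(
  { updatedDate: scoreNow, isFeatured: true, applicantCount: 0 },
  scoreNow
);
const staleScore = rankingScore(
  { updatedDate: scoreNow - 9 * 86400, isFeatured: false, applicantCount: 0 },
  scoreNow
);
// freshFeaturedScore is larger than staleScore
```

Because the freshness term depends on the current time, a score like this would need to be recomputed and reindexed periodically.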

Ethan
09:55 PM
With the sort by widget, what would be the format for applying multiple sort_by fields? Would it just be comma separated, like "jobs/sort/updated_date:desc,posting_id:desc"?
Jason
09:57 PM
That’s correct
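Editor's note: a small helper that assembles the comma-separated value confirmed above and enforces the three-field limit Jason mentions. The helper name and its types are hypothetical.

```typescript
type SortClause = { field: string; order: "asc" | "desc" };

// Build a SortBy value with several sort fields: "<collection>/sort/<a>:<dir>,<b>:<dir>".
function multiSortValue(collection: string, clauses: SortClause[]): string {
  if (clauses.length === 0 || clauses.length > 3) {
    throw new Error("sort_by takes between 1 and 3 fields");
  }
  const sortBy = clauses.map((c) => `${c.field}:${c.order}`).join(",");
  return `${collection}/sort/${sortBy}`;
}

const newestThenPostingId = multiSortValue("jobs", [
  { field: "updated_date", order: "desc" },
  { field: "posting_id", order: "desc" },
]);
// newestThenPostingId is "jobs/sort/updated_date:desc,posting_id:desc"
```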
