Grouping and Faceting Denormalized Data
TLDR Phillip asked about grouping and faceting denormalized data. Viji provided a specific example for clarity. Jason confirmed Phillip's plan and suggested a different approach for consideration. Phillip acknowledged the advice.
Jun 08, 2022
Phillip
06:09 PM We want to take the list of results from X and group them on our server into a list of Y objects, each of which has a list of n X documents.
The problem we are trying to solve is how to do the faceting here.
1. We want to search over collection X and group it into Y objects.
2. We want the facet counts to be proportional to the number n of Y objects we end up with.
This is our plan right now:
1. Retrieve the full list of X documents and group them into Y type objects on our server.
2. Have another Typesense collection of Y type documents and do a second search for all the ids of the Y documents, which would return facet counts proportional to the Y objects we have on the server.
Is this the most reasonable way to do this? Is there something we are missing?
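The two-step plan above can be sketched roughly as follows. This is only an illustration under assumptions: the field name `y_id`, the facet field `country`, and both helper functions are hypothetical, and it assumes filtering on `id` is available in your Typesense version. It builds the search parameters without calling Typesense.

```python
from collections import defaultdict


def group_x_into_y(x_docs, parent_key="y_id"):
    """Step 1: bucket X hits by the id of the Y object they belong to.

    `parent_key` is a hypothetical field that links an X document to its Y.
    """
    groups = defaultdict(list)
    for doc in x_docs:
        groups[doc[parent_key]].append(doc)
    return dict(groups)


def facet_params_for_y(y_ids, facet_fields):
    """Step 2: search parameters for the Y collection, limited to the grouped ids."""
    return {
        "q": "*",
        "filter_by": "id:[" + ",".join(y_ids) + "]",
        "facet_by": ",".join(facet_fields),
    }


# Made-up X hits standing in for the first search's results.
x_hits = [
    {"id": "x1", "y_id": "y1"},
    {"id": "x2", "y_id": "y1"},
    {"id": "x3", "y_id": "y2"},
]
groups = group_x_into_y(x_hits)
params = facet_params_for_y(sorted(groups), ["country"])
```

With the official Python client, `params` would be passed to something like `client.collections["Y"].documents.search(params)`. If the id list gets long, sending the query through the multi-search endpoint is probably safer than a plain GET request.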
Phillip
06:10 PM cc Viji Todd Willian
Jason
06:16 PM
Viji
08:06 PM We have a Movies collection, which holds movies along with their metadata.
We have a Movie Content collection, which is the content of each movie broken up into individual dialogues (each dialogue is a separate document with its own metadata, such as the actor's gender). We thought we should denormalize this Movie Content collection by also adding the movie metadata to each of its documents.
We have complex searches that run against both the Movie metadata and the Movie Content, such as:
We want movies that are 30-60 minutes long released in the last 90 days with these producers from these countries where a female actor said X word or Y word but no male actor said Z word.
When we return these results, we want the counts for the facets to reflect the counts of movies that met the complex search requirements rather than the counts of dialogues that met those requirements.
Hope this helps!
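As a rough illustration of the denormalization described above, each dialogue document could carry a copy of its movie's metadata. All field names here (`runtime_minutes`, `country`, `actor_gender`, the `movie_` prefix) are hypothetical and not taken from the actual schema.

```python
def denormalize(movie, dialogue):
    """Return a Movie Content document that also carries the movie's metadata.

    Movie fields are copied with a `movie_` prefix so they cannot collide
    with the dialogue's own fields; the movie's id becomes `movie_id`.
    """
    doc = dict(dialogue)
    for key, value in movie.items():
        doc["movie_" + key] = value
    return doc


# Made-up records for illustration only.
movie = {"id": "m1", "runtime_minutes": 45, "country": "FR"}
dialogue = {"id": "d1", "text": "hello there", "actor_gender": "female"}

content_doc = denormalize(movie, dialogue)
```

Every dialogue of the same movie then shares the same `movie_id`, which is what makes counting per dialogue differ from counting per movie.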
Jason
08:22 PM The plan Phillip mentioned above should work.
Another thing I'd recommend trying, to see if it gives you what you're looking for, is using `group_by`, maybe grouping by the Movie ID, when searching through the Movie Content collection.
Phillip
08:25 PM
Phillip
08:25 PM
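Jason's `group_by` suggestion might look something like the sketch below. The field names (`text`, `movie_id`, `movie_country`, `movie_runtime_minutes`, `actor_gender`) and both helpers are assumptions for illustration. `group_by` needs the grouped field to be declared as a facet in the schema, and whether the resulting facet counts reflect movies rather than dialogues is worth checking against your Typesense version.

```python
def grouped_search_params(query, filters, facet_fields, group_field="movie_id"):
    """Search parameters for the Movie Content collection, grouped per movie."""
    return {
        "q": query,
        "query_by": "text",
        "filter_by": " && ".join(filters),
        "facet_by": ",".join(facet_fields),
        "group_by": group_field,
        "group_limit": 1,  # one representative dialogue per movie
    }


def movie_ids_from_groups(response):
    """Collect the group keys (movie ids) from a grouped search response."""
    return [group["group_key"][0] for group in response["grouped_hits"]]


params = grouped_search_params(
    "X",
    ["movie_runtime_minutes:[30..60]", "actor_gender:=female"],
    ["movie_country"],
)

# Hand-written stand-in for a grouped response, not real output.
sample_response = {
    "grouped_hits": [
        {"group_key": ["m1"], "hits": []},
        {"group_key": ["m2"], "hits": []},
    ]
}
movie_ids = movie_ids_from_groups(sample_response)
```

Compared with the two-search plan, this keeps everything in a single query against Movie Content, at the cost of depending on how facets behave under grouping.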