#community-help

Explaining 'Group By' Function in Search Process

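TLDR David asked whether setting a large group_limit with group_by carries a performance cost. Kishore Nallan explained that a large group_limit uses more memory at search run time but does not affect search performance, and that group_by does not imply exhaustive_search=true. He also clarified that groups are ranked by each group's top record under the sort_by order.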

Solved
Sep 16, 2023 (2 weeks ago)
David
02:47 PM
Is there a recommended way to make group_by return all the hits in a group? Right now I have group_limit set to 100. I don't think we'll ever have a group that large, but I wasn't sure whether doing it that way would affect the search process.
Kishore Nallan
02:49 PM
Since grouping is currently implemented as an exhaustive operation, we aggregate on all groups anyway and truncate later. This might be optimized later, but for now a large group limit will use more memory and will not affect performance.
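For reference, a grouped search request along the lines David describes might be built as sketched below. `q`, `query_by`, `group_by`, and `group_limit` are Typesense search parameters; the helper name `build_grouped_search` and the field names `title` and `genre` are assumptions made for illustration, and the permitted range for `group_limit` may depend on the server version.

```python
def build_grouped_search(query, group_field, group_limit=100):
    """Return search parameters for a grouped query (hypothetical helper)."""
    return {
        "q": query,
        "query_by": "title",         # assumed field name, not from the thread
        "group_by": group_field,     # field whose values define the groups
        "group_limit": group_limit,  # max hits returned per group (100 is the value David mentions)
    }


params = build_grouped_search("*", "genre")
print(params)
```

With the official Python client, a dictionary like this could be passed to `client.collections['books'].documents.search(params)`, where the collection name `books` is again only an example.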
David
02:50 PM
More memory: does this mean during search run time, or persisted memory consumption?
Kishore Nallan
02:50 PM
Search run time
David
02:50 PM
understood, thanks for the quick weekend reply!

David
02:52 PM
Sorry, last question. You say grouping is "implemented as an exhaustive operation". Does this mean that including group_by effectively sets exhaustive_search=true?
Kishore Nallan
02:52 PM
No, that's different.
David
02:52 PM
ok just making sure
David
04:33 PM
Jason, sorry for the ping. I see Kishore is offline now and was worried the thread would get ignored. When you have a moment, could you please explain how sort_by is interpreted when using group_by? For example, if I sort by publishedDate:desc and group by genre, will the groups be sorted so that the genre with the most recent publishedDate is first?
Sep 17, 2023 (1 week ago)
Kishore Nallan
03:27 AM
In each group, the record with the highest published date is used as the representative of that group for sorting. So the group containing the record with the highest published date appears first, and so on.
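The ranking rule described above can be illustrated with a small standalone sketch. This is not Typesense's implementation, only a model of the behaviour stated in the thread: every hit is aggregated into its group, each group is sorted and truncated to `group_limit`, and groups are then ordered by their top record. The function name `rank_groups` and the sample documents are hypothetical.

```python
from collections import defaultdict


def rank_groups(docs, group_field, sort_field, group_limit=3):
    """Group docs, then order groups by each group's best record (descending)."""
    groups = defaultdict(list)
    for doc in docs:
        # Aggregate every hit into its group first (the exhaustive step).
        groups[doc[group_field]].append(doc)

    ranked = []
    for key, hits in groups.items():
        hits.sort(key=lambda d: d[sort_field], reverse=True)
        # Truncate only after the whole group has been collected.
        ranked.append((key, hits[:group_limit]))

    # The first hit of each group is its representative for sorting.
    ranked.sort(key=lambda g: g[1][0][sort_field], reverse=True)
    return ranked


docs = [
    {"title": "A", "genre": "fantasy", "publishedDate": 1600000000},
    {"title": "B", "genre": "mystery", "publishedDate": 1690000000},
    {"title": "C", "genre": "fantasy", "publishedDate": 1650000000},
    {"title": "D", "genre": "history", "publishedDate": 1500000000},
]

ranked = rank_groups(docs, "genre", "publishedDate")
order = [genre for genre, _ in ranked]
print(order)
```

Here the `mystery` group comes first because it holds the most recent record, followed by `fantasy` (represented by title C) and then `history`.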
