#community-help

Deduplicating Usernames in a Comments Collection

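TLDR: Todd wanted distinct usernames back when searching a comments collection in which usernames are heavily duplicated. Jason suggested facet_query, or a separate collection if more control over ranking is needed, and SamHendley recommended group_by. Todd appreciated the help.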

Solved
Dec 02, 2022 (10 months ago)
Todd
07:07 PM
We’re trying to search some data that is heavily duplicated in a collection. Is there any way to deduplicate matches? For context, we have a collection of comments with a username field, and we want to search usernames using this collection. Our concern is that we might only get one user back: the username field is duplicated across so many comments that a username search could return nothing but comments from the user with the highest match score.

Is there a way to only get back distinct usernames, or should we just have a separate collection of user information we search on instead?
Jason
07:15 PM
You could do a facet_query for this
Todd
07:17 PM
Right, but won’t that mean I lose ranking on my results in this case?
Jason
07:18 PM
It will be ranked by the most popular usernames…
Todd
07:18 PM
Meaning the usernames with the most matching entries?
Jason
07:18 PM
Correct
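[Editor's sketch] The facet_query approach discussed above might look roughly like the following. This is a minimal sketch, not a confirmed setup from the thread: it assumes the username field is declared as a facet in the comments collection schema, the helper names are hypothetical, and the response shape (facet_counts, field_name, counts) is assumed to match what the Typesense search endpoint returns.

```python
from typing import Any


def build_username_facet_search(prefix: str, max_usernames: int = 10) -> dict[str, Any]:
    """Build search parameters that ask for distinct usernames via faceting.

    Assumes `username` is a faceted field; `max_usernames` is a
    hypothetical knob mapped onto `max_facet_values`.
    """
    return {
        "q": "*",                             # match every comment
        "query_by": "username",
        "facet_by": "username",               # one facet value per distinct username
        "facet_query": f"username:{prefix}",  # keep only facet values matching the prefix
        "max_facet_values": max_usernames,
    }


def usernames_from_facets(response: dict[str, Any], field: str = "username") -> list[tuple[str, int]]:
    """Extract (username, comment_count) pairs from a search response."""
    for facet in response.get("facet_counts", []):
        if facet.get("field_name") == field:
            return [(c["value"], c["count"]) for c in facet.get("counts", [])]
    return []
```

With the official Python client, the parameters would presumably be passed to something like client.collections['comments'].documents.search(params). As Jason notes, the values come back ordered by how many comments each username has, not by text relevance.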
Jason
07:18 PM
If you need to control ranking beyond that, then you would have to put it in a separate collection
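[Editor's sketch] The separate-collection route could be approached by collapsing the comments into one document per user before indexing. The following is only an illustration under stated assumptions: the collection name users, the comment_count field, and the helper name are all hypothetical and do not come from the thread.

```python
from collections import Counter
from typing import Any

# Hypothetical schema for a dedicated users collection; `comment_count`
# is an invented field that could serve as a ranking signal.
USERS_SCHEMA: dict[str, Any] = {
    "name": "users",
    "fields": [
        {"name": "username", "type": "string"},
        {"name": "comment_count", "type": "int32"},
    ],
    "default_sorting_field": "comment_count",
}


def build_user_documents(comments: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collapse comment documents into one document per distinct username."""
    counts = Counter(comment["username"] for comment in comments)
    # Most active users first; ties broken alphabetically for stable output.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"username": name, "comment_count": n} for name, n in ordered]
```

Searching this smaller collection by username would then rank users by text match like any other search, with comment_count available as a tie-breaker.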
Todd
07:19 PM
Very interesting stuff! Thank you very much for helping us find our way around.
SamHendley
07:47 PM
Also, group_by might be appropriate.
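
[Editor's sketch] A group_by search that returns at most one comment per username might be set up as below. As before, this is a sketch under assumptions: username is assumed to be a faceted field (grouping requires it), the helper names are hypothetical, and the grouped_hits / group_key response shape is assumed to match the Typesense search endpoint.

```python
from typing import Any


def build_grouped_username_search(query: str, users_per_page: int = 10) -> dict[str, Any]:
    """Build search parameters that group matching comments by username."""
    return {
        "q": query,
        "query_by": "username",
        "group_by": "username",   # one group per distinct username
        "group_limit": 1,         # keep a single representative comment per group
        "per_page": users_per_page,
    }


def distinct_usernames(response: dict[str, Any]) -> list[str]:
    """Read the distinct usernames, in ranked order, from a grouped response."""
    # Each group_key is a list with one entry per group_by field.
    return [group["group_key"][0] for group in response.get("grouped_hits", [])]
```

Unlike the facet approach, the groups here follow the normal search ranking, which may address Todd's concern about losing relevance ordering.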