#community-help

Handling Performance with Large Document Collection

TLDR Zhen asked whether adding a wildcard schema field that creates one receive_time field per user would hurt query performance. Kishore Nallan explained that every field adds memory overhead: the design works at the current scale of about 800 users but would probably need a redesign at thousands of users.

Solved
Dec 06, 2022 (9 months ago)
Zhen
06:33 AM
Hi everyone, I am looking for some suggestions.

I have a collection whose documents need to be sorted by a field called receive_time, which has a different value for each user who has access to the document.

The document now looks something like this:
{
    ... other document data,
    receive_time: {
        [user id 1]: {receive time 1},
        [user id 2]: {receive time 2}
    }
}

I am thinking of updating the collection schema with the following field:

{
    name: '^receive_time.*',
    type: 'int64',
    facet: true
}

However, since each user has a unique id, I imagine a lot of fields would be created in the schema.

Will this affect the performance of the query in general? If yes, is there any suggestion to solve this?

Thanks!
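
To make the proposed layout concrete, here is a minimal sketch of how the nested receive_time map could be flattened into one top-level field per user and how a per-user sort could then be requested. The helper names, the `receive_time_<userId>` naming, and the sample values are assumptions for illustration and are not from the thread. `q` and `sort_by` are standard Typesense search parameters, but whether the `'^receive_time.*'` pattern matches these names exactly as written should be verified against your Typesense version.

```javascript
// Hypothetical helper: turn the nested receive_time map into one
// top-level field per user (the receive_time_<userId> naming is an assumption).
function flattenReceiveTimes(doc) {
  const { receive_time: receiveTimes = {}, ...rest } = doc;
  const flat = { ...rest };
  for (const [userId, timestamp] of Object.entries(receiveTimes)) {
    flat[`receive_time_${userId}`] = timestamp;
  }
  return flat;
}

// Hypothetical helper: search parameters that sort by one user's receive time.
function searchParamsForUser(userId) {
  return { q: '*', sort_by: `receive_time_${userId}:desc` };
}

// Sample document with made-up ids and timestamps.
const doc = {
  id: 'doc-1',
  title: 'Example',
  receive_time: { u1: 1670308380, u2: 1670308440 },
};

const flat = flattenReceiveTimes(doc);
const params = searchParamsForUser('u1');
console.log(flat, params);
```

Each user's query sorts on a different field name, which is why the schema ends up with one field per user.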
Kishore Nallan
06:35 AM
Every field has some overhead associated with it, so this will certainly cause memory overhead. Will the users be in the thousands or millions?
Kishore Nallan
06:36 AM
Your document size could also be large because you have to list every user in a separate field.
Zhen
06:46 AM
Only a few (around 3 ~ 4) users will have access to each document.

For now, my system is not that large, so there are only around 800 users. But if it were to scale to thousands of users, the solution I mentioned would no longer be suitable, right?
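
The figures above (about 800 users, 3 to 4 users per document) can be turned into a rough count of how many per-user fields the schema would accumulate. This is a sketch on synthetic data, assuming each distinct matched field name becomes its own schema field. The helper names, the `receive_time_<userId>` naming, and the document count of 1000 are hypothetical.

```javascript
// Count the distinct per-user field names a set of documents would create.
function countDynamicFields(docs) {
  const names = new Set();
  for (const doc of docs) {
    for (const userId of Object.keys(doc.receive_time ?? {})) {
      names.add(`receive_time_${userId}`);
    }
  }
  return names.size;
}

// Build synthetic documents: each one is shared with usersPerDoc users,
// assigned round-robin from a pool of totalUsers.
function makeDocs(docCount, totalUsers, usersPerDoc) {
  const docs = [];
  for (let i = 0; i < docCount; i++) {
    const receive_time = {};
    for (let j = 0; j < usersPerDoc; j++) {
      const userId = `u${(i * usersPerDoc + j) % totalUsers}`;
      receive_time[userId] = 1670308380 + i; // made-up timestamp
    }
    docs.push({ id: `doc-${i}`, receive_time });
  }
  return docs;
}

const docs = makeDocs(1000, 800, 4);
const fieldCount = countDynamicFields(docs);
const maxFieldsPerDoc = Math.max(
  ...docs.map((d) => Object.keys(d.receive_time).length)
);
console.log({ fieldCount, maxFieldsPerDoc });
```

With these inputs each document carries only 4 receive_time fields, while the number of distinct field names reaches 800, one per user. The growth is in the schema rather than in any single document.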
Kishore Nallan
06:48 AM
Yes, at that point you will probably need to redesign this. For now, the current design will work.
Kishore Nallan
06:49 AM
There are no easier alternatives I can suggest now.
Zhen
06:54 AM
Ok sure, thanks for your help! Appreciate it.