#community-help

Issue with Upsert Duplicating Documents Due to Nested ID

TLDR Ed was encountering duplicate documents when using upsert. Jason explained that the 'id' must be a top-level key to prevent this issue.

Solved
Aug 02, 2023 (4 months ago)
Ed
04:44 PM
Trying to use upsert, but I see my docs duplicating.
# Import documents into the collection
upsert = client.collections[collection_name].documents.import_(
    jsonl_data.encode("utf-8"), {"action": "upsert"}
)

example data:
"data": {
    "idFS": "xx",
    "jobNumber": "xxx",
    "applicationUrl": "/xxx",
    "idClient": "870844",
    "id": "870844"
},
"content": {xxx}
Jason
04:46 PM
id needs to be a top-level key.

De-duplication does not work when id is nested inside another field.
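
A minimal sketch of one way to apply this before importing, assuming each document has the shape of the example data earlier in the thread (an id nested under "data"). The helpers `promote_id` and `to_jsonl` are hypothetical names used for illustration and are not part of the Typesense client; casting the id to a string is also an assumption about what the server expects.

```python
import json


def promote_id(doc, nested_key="data"):
    """Return a copy of doc with the nested id copied to the top level.

    Hypothetical helper: leaves the document unchanged if it already has a
    top-level id, or if there is no id under nested_key.
    """
    nested = doc.get(nested_key) or {}
    if "id" in doc or "id" not in nested:
        return dict(doc)
    # Assumption: the id should be a string, so cast it defensively.
    return {"id": str(nested["id"]), **doc}


def to_jsonl(docs):
    """Serialize documents as JSONL, one JSON object per line."""
    return "\n".join(json.dumps(promote_id(d)) for d in docs)


# Sample document modeled on the example data in the thread.
docs = [
    {
        "data": {"idFS": "xx", "idClient": "870844", "id": "870844"},
        "content": {},
    }
]
jsonl_data = to_jsonl(docs)
```

The resulting `jsonl_data` can then be passed to the same `import_` call with `{"action": "upsert"}` shown earlier in the thread, so that repeated imports match on the top-level id.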
Ed
04:47 PM
Got it, thanks.
