Hi guys, Deleted Collection Still Indexed on Disk...
# community-help
a
Hi guys,
Deleted Collection Still Indexed on Disk After Docker Restart
I am experiencing an issue where a deleted collection still gets indexed from disk after restarting my Typesense Docker container. Here are the details:
I have two collections: a large collection (~6.5 million vectors) that I need to delete, and another collection that must remain intact.
After deleting the large collection using the API, it seems to persist on disk. When I restart the Docker container, it re-indexes the deleted collection, causing long startup times.
I cannot clear the entire Typesense volume, as it contains the second collection, which I need to keep.
Could you please advise on how to properly delete a collection so that it does not persist on disk after a restart? Is there a way to manually remove it from disk without affecting other collections?
Looking forward to your guidance.
j
In Typesense, writes are first written to a write-ahead log (WAL) on disk and then applied to the in-memory index. This WAL is compacted into a snapshot every hour. On restart, the last snapshot is loaded and then the last hour of the WAL is replayed. So if you delete a collection and restart before the next snapshot has happened, Typesense essentially reloads the entire collection from the previous snapshot and then replays the WAL on top of it. To avoid this, you can trigger a snapshot manually using the snapshot endpoint before restarting. That latest snapshot will no longer contain the data you're trying to delete.
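For anyone finding this later, here is a rough sketch of that sequence (delete the collection, then snapshot before restarting) using only the Python standard library. It assumes the `DELETE /collections/{name}` and `POST /operations/snapshot?snapshot_path=...` endpoints with the `X-TYPESENSE-API-KEY` header; the host, port, API key, collection name and snapshot path are placeholders, and the helper names are made up for this example, not part of any Typesense client.

```python
import urllib.parse
import urllib.request


def build_delete_collection_request(base_url, api_key, collection):
    """Build (but do not send) a DELETE request for one collection."""
    name = urllib.parse.quote(collection, safe="")
    url = f"{base_url.rstrip('/')}/collections/{name}"
    return urllib.request.Request(
        url, method="DELETE", headers={"X-TYPESENSE-API-KEY": api_key}
    )


def build_snapshot_request(base_url, api_key, snapshot_path):
    """Build (but do not send) a POST request that triggers a manual snapshot.

    snapshot_path is a directory on the Typesense server (inside the
    container when running under Docker); the value used is an assumption.
    """
    query = urllib.parse.urlencode({"snapshot_path": snapshot_path})
    url = f"{base_url.rstrip('/')}/operations/snapshot?{query}"
    return urllib.request.Request(
        url, method="POST", headers={"X-TYPESENSE-API-KEY": api_key}
    )


def send(request, timeout=60):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Usage would look like `send(build_delete_collection_request("http://localhost:8108", "xyz", "big_vectors"))` followed by `send(build_snapshot_request("http://localhost:8108", "xyz", "/tmp/typesense-snapshot"))`, and only then restarting the container. Building the requests separately from sending them makes it easy to inspect the URLs before running anything against a live server.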
a
Got it, thank you 🙌