#community-help

Disk Space Issue with Typesense Machine

TLDR: John reported steadily increasing disk usage on their Typesense machine, apparently caused by old RocksDB .sst files that were not being purged. Kishore Nallan suggested triggering a manual compaction and shared a command for it, available in recent 0.24 RC builds (e.g. 0.24.0.rcn43). John will test this solution the following week.


Dec 07, 2022
John
08:21 AM
We’re running low on disk space on our Typesense machine and it seems like it’s because it’s keeping some old indexes or something (.sst files), because the collection hasn’t changed much but the disk usage keeps increasing. Any idea what this could be?
Kishore Nallan
08:22 AM
What version are you using?
John
08:22 AM
0.24.0.rcn34
Kishore Nallan
08:23 AM
Do you have regular updates?
Kishore Nallan
08:24 AM
And what happens when you restart Typesense? Does the disk size get released?
John
08:27 AM
We create a new collection, repoint the alias, then delete the old collection. This happens once per hour.

No, it doesn’t seem to get released
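For readers unfamiliar with this refresh pattern, here is a minimal sketch of the three steps as Typesense REST calls. It assumes the collection removed at the end is the previous one, which the alias no longer points to. The alias and collection names are hypothetical, and the sketch only builds the list of requests rather than sending them.

```python
import json


def refresh_plan(alias, old_collection, new_schema):
    """Return the ordered (method, path, body) steps of one refresh cycle.

    All names passed in are illustrative; nothing is sent over the network.
    """
    new_collection = new_schema["name"]
    return [
        # 1. Create the new collection (documents are indexed into it afterwards).
        ("POST", "/collections", json.dumps(new_schema)),
        # 2. Repoint the alias so searches hit the new collection.
        ("PUT", f"/aliases/{alias}", json.dumps({"collection_name": new_collection})),
        # 3. Drop the collection the alias used to point to.
        ("DELETE", f"/collections/{old_collection}", None),
    ]
```

Each hourly cycle writes a full copy of the data and then deletes one, so the space freed by the delete only comes back once RocksDB compacts the affected files.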
Kishore Nallan
08:31 AM
Oh I see, the compaction does not happen that often. We might have to expose an API to trigger that more aggressively
John
08:39 AM
How often does compaction happen?
Kishore Nallan
08:45 AM
RocksDB has various levels, and they get merged as each level crosses some predefined size threshold. We don't really change those defaults too much. I think there is also a global compaction that runs every 7 days, but I need to check that.
Kishore Nallan
08:47 AM
How many SST files are there in db dir and what's their average size?
John
09:42 AM
~280 files
• ~80 files at 65M each, from Nov 22-Nov 29
• No SST files for a few days, then ~40 at 65M each, from Dec 3-Dec 4.
• Finally about 120 at 20M each, from Dec 5 to now.
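A breakdown like the one above can be produced with a short script. This is a minimal sketch, assuming the SST files sit directly inside the Typesense data directory's db folder; the path you pass in is your own, and grouping by UTC modification date is just one reasonable choice.

```python
import os
from collections import defaultdict
from datetime import datetime, timezone


def summarize_sst_files(db_dir):
    """Group *.sst files in db_dir by modification date (UTC).

    Returns {"YYYY-MM-DD": (file_count, total_bytes)}, sorted by date.
    """
    summary = defaultdict(lambda: [0, 0])
    for entry in os.scandir(db_dir):
        if entry.is_file() and entry.name.endswith(".sst"):
            info = os.stat(entry.path)
            day = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).strftime("%Y-%m-%d")
            summary[day][0] += 1
            summary[day][1] += info.st_size
    return {day: tuple(counts) for day, counts in sorted(summary.items())}


def print_summary(db_dir):
    """Print one line per day: date, number of files, total size in MiB."""
    for day, (count, total) in summarize_sst_files(db_dir).items():
        print(f"{day}  {count:4d} files  {total / 2**20:8.1f} MiB")
```

Calling `print_summary("/path/to/typesense-data/db")` (a placeholder path) shows whether old files are lingering or whether the growth comes from recent writes.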
Dec 08, 2022
Kishore Nallan
08:57 AM
I will see if we can add an operations API to trigger manual compaction of RocksDB. How much data do you write on a full refresh?

Kishore Nallan
08:58 AM
Btw, if you are creating snapshots explicitly by calling the snapshot API, then those files will be hard linked to the DB SST files and will prevent files from getting purged.
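One way to check whether hard links are pinning files is to look at each SST file's link count. This is a diagnostic sketch rather than a Typesense tool: it assumes a POSIX-style filesystem and only reports that some other directory entry (for example a snapshot) references the same data, not which one.

```python
import os


def hard_linked_sst_files(db_dir):
    """Return the sorted names of *.sst files in db_dir with more than one hard link."""
    linked = []
    for entry in os.scandir(db_dir):
        if entry.is_file(follow_symlinks=False) and entry.name.endswith(".sst"):
            # st_nlink counts directory entries pointing at the same inode.
            if os.stat(entry.path).st_nlink > 1:
                linked.append(entry.name)
    return sorted(linked)
```

An empty result suggests snapshots are not what keeps the old files on disk.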
John
09:02 AM
About 500 MB data, and quite a lot of facets so the resulting indices are probably pretty large. We’re not creating snapshots explicitly.
Kishore Nallan
09:03 AM
Only raw data is stored on disk, so the disk size will be more or less the same as the ingested data (some duplication happens due to buffering etc., but indices are in-memory only).

Kishore Nallan
09:04 AM
It's surprising that the storage is growing unbounded this way. We have many customers with similar data size / usage pattern without any issues.
John
09:15 AM
It’s not really a big deal for us as long as the disk space is released at some point. In practice it just means that we have to make the disk a bit larger to accommodate. I’ll let you know if it continues to be an issue! Thanks.
Kishore Nallan
03:52 PM
John, can you please check the size of the db/LOG file?
John
04:03 PM
218M
Kishore Nallan
04:08 PM
Ok then it's not that. Just realized that we need to set up proper rotation for that as well.
Kishore Nallan
04:09 PM
I will have that and an endpoint for DB compaction ready in a couple of days, in the next RC build.

Dec 22, 2022
Kishore Nallan
06:52 AM
John

In recent 0.24 RC builds (e.g. 0.24.0.rcn43) we have a way for you to trigger a compaction manually like this:

curl -X POST '' --header 'X-TYPESENSE-API-KEY: abcd'

This will restore the disk space immediately. Can you please try that out?
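The URL in the command above did not survive in this archive. As a hedged sketch, the same call can be made from Python, assuming the endpoint is the `POST /operations/db/compact` operation described in the Typesense API reference. The base URL `http://localhost:8108` and the key `abcd` are placeholders, not values confirmed by this thread.

```python
import urllib.request


def build_compaction_request(base_url, api_key):
    """Build (but do not send) a POST request to the assumed compaction endpoint."""
    url = base_url.rstrip("/") + "/operations/db/compact"  # assumed path, see note above
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the call takes no parameters in this sketch
        method="POST",
        headers={"X-TYPESENSE-API-KEY": api_key},
    )


def trigger_compaction(base_url, api_key, timeout=60):
    """Send the request and return (HTTP status, response body as text)."""
    request = build_compaction_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

For example, `trigger_compaction("http://localhost:8108", "abcd")` would issue the call against a local server. Comparing the db directory size before and after shows how much space was reclaimed.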
John
08:22 PM
Hey, I’m unfortunately not able to try that out for a few days, but hopefully next week.