Himank Chaudhary (07/20/2022, 8:04 AM):
Kishore Nallan (07/20/2022, 8:08 AM):
drop? Do you delete the collection, then recreate it and then repopulate it?
Himank Chaudhary (07/20/2022, 8:10 AM):
Kishore Nallan (07/20/2022, 8:11 AM):
Himank Chaudhary (07/20/2022, 8:13 AM):
Kishore Nallan (07/20/2022, 8:14 AM):
Kishore Nallan (07/20/2022, 8:14 AM):
Himank Chaudhary (07/20/2022, 8:17 AM):
Kishore Nallan (07/20/2022, 8:19 AM):
Himank Chaudhary (07/20/2022, 8:21 AM):
Kishore Nallan (07/20/2022, 8:22 AM):
Himank Chaudhary (07/20/2022, 8:24 AM):
Himank Chaudhary (07/27/2022, 5:39 PM):
Himank Chaudhary (07/27/2022, 5:40 PM):
Himank Chaudhary (07/27/2022, 7:35 PM):
Kishore Nallan (07/28/2022, 12:41 AM):
Himank Chaudhary (07/28/2022, 1:03 AM):
Kishore Nallan (07/28/2022, 1:25 AM):
Himank Chaudhary (07/28/2022, 1:27 AM):
Himank Chaudhary (07/28/2022, 1:28 AM):
Kishore Nallan (07/28/2022, 1:29 AM):
Kishore Nallan (07/28/2022, 1:30 AM):
Himank Chaudhary (07/28/2022, 1:30 AM):
Kishore Nallan (07/28/2022, 1:30 AM):
Himank Chaudhary (07/28/2022, 1:31 AM):
Kishore Nallan (07/28/2022, 1:32 AM):
import endpoint?
Himank Chaudhary (07/28/2022, 1:33 AM):
for i in {1..100}; do curl "http://localhost:8108//collections/companies/documents?action=upsert" -d '{"company_name": "user6284786258713890", "num_employees": 1, "country": "ddddd"}'; done
Kishore Nallan (07/28/2022, 1:35 AM):
Himank Chaudhary (07/28/2022, 1:38 AM):
Himank Chaudhary (07/28/2022, 1:38 AM):
Kishore Nallan (07/28/2022, 1:55 AM):
Himank Chaudhary (07/28/2022, 1:59 AM):
Himank Chaudhary (07/28/2022, 1:59 AM):
Kishore Nallan (07/28/2022, 2:24 AM):
Ovais Tariq (07/28/2022, 1:43 PM):
pboros (07/28/2022, 1:45 PM):
pboros (07/28/2022, 1:45 PM):
pboros (07/28/2022, 1:45 PM):
pboros (07/28/2022, 1:46 PM):
Kishore Nallan (07/28/2022, 1:48 PM):
pboros (07/28/2022, 1:48 PM):
pboros (07/28/2022, 1:50 PM):
Kishore Nallan (07/28/2022, 1:51 PM):
Kishore Nallan (07/28/2022, 1:51 PM):
Kishore Nallan (07/28/2022, 1:51 PM):
pboros (07/28/2022, 1:52 PM):
Kishore Nallan (07/28/2022, 1:52 PM):
Kishore Nallan (07/28/2022, 1:53 PM):
pboros (07/28/2022, 1:53 PM):
pboros (07/28/2022, 1:53 PM):
Kishore Nallan (07/28/2022, 1:54 PM):
pboros (07/28/2022, 1:59 PM):
pboros (07/28/2022, 2:00 PM):
2022-07-28 14:56:52.331753 imported a batch
2022-07-28 14:56:52.331795 starting to delete coll2
Traceback (most recent call last):
File "/Users/pboros/ts_test.py", line 111, in <module>
main()
File "/Users/pboros/ts_test.py", line 86, in main
client.collections['coll2'].delete()
File "/Users/pboros/.virtualenvs/ts/lib/python3.9/site-packages/typesense/collection.py", line 25, in delete
return self.api_call.delete(self._endpoint_path())
File "/Users/pboros/.virtualenvs/ts/lib/python3.9/site-packages/typesense/api_call.py", line 156, in delete
return self.make_request(requests.delete, endpoint, True,
File "/Users/pboros/.virtualenvs/ts/lib/python3.9/site-packages/typesense/api_call.py", line 113, in make_request
raise ApiCall.get_exception(r.status_code)(r.status_code, error_message)
typesense.exceptions.ObjectNotFound: [Errno 404] No collection with name `coll2` found.
pboros (07/28/2022, 2:00 PM):
In [4]: [c['name'] for c in client.collections.retrieve()]
Out[4]: ['coll1']
pboros (07/28/2022, 2:00 PM):
Ovais Tariq (07/28/2022, 2:26 PM):
pboros (07/28/2022, 2:28 PM):
Ovais Tariq (07/28/2022, 2:29 PM):
while(iter->Valid() && iter->key().starts_with(del_override_prefix)) {
    store->remove(iter->key().ToString());
    iter->Next();
}
Ovais Tariq (07/28/2022, 2:30 PM):
Kishore Nallan (07/28/2022, 2:33 PM):
Ovais Tariq (07/28/2022, 3:08 PM):
pboros (07/28/2022, 3:08 PM):
Ovais Tariq (07/28/2022, 3:10 PM):
pboros (07/28/2022, 3:10 PM):
pboros (07/28/2022, 3:11 PM):
Kishore Nallan (07/28/2022, 3:11 PM):
pboros (07/28/2022, 3:13 PM):
pboros (07/28/2022, 7:39 PM):
Kishore Nallan (07/29/2022, 6:09 AM):
The connection_timeout_seconds configuration in the client init is wrong: it should be at the same level as api_key and not inside nodes. This might explain the other `A collection with name coll1 already exists.` error.
pboros (07/29/2022, 6:27 AM):
pboros (07/29/2022, 6:40 AM):
pboros (07/29/2022, 6:41 AM):
pboros (07/29/2022, 6:42 AM):
Kishore Nallan (07/29/2022, 6:42 AM):
pboros (07/29/2022, 6:42 AM):
Kishore Nallan (07/29/2022, 6:57 AM):
coll1 is also hard coded here, so all small batch writes are going to only coll1 in this test: https://gist.github.com/pboros/8be8fa11f9f7a1e10d1a49bd46c1411e#file-ts_test-py-L57
Doesn't change anything much, just fyi.
pboros (07/29/2022, 7:15 AM):
pboros (07/29/2022, 7:15 AM):
pboros (07/29/2022, 7:17 AM):
pboros (07/29/2022, 7:17 AM):
Kishore Nallan (07/29/2022, 7:19 AM):
for i in `seq 1 100`; do curl "http://localhost:8108/health" -X POST -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"; done
Kishore Nallan (07/29/2022, 7:20 AM):
pboros (07/29/2022, 7:22 AM):
pboros (07/29/2022, 7:22 AM):
pboros (07/29/2022, 7:23 AM):
pboros (07/29/2022, 7:24 AM):
pboros (07/29/2022, 7:24 AM):
Kishore Nallan (07/29/2022, 7:35 AM):
pboros (07/29/2022, 7:36 AM):
pboros (07/29/2022, 7:36 AM):
Tasks: 3 total, 1 running, 2 sleeping, 0 stopped, 0 zombie
%Cpu(s): 11.8 us, 0.9 sy, 0.0 ni, 87.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 16395664 total, 13430728 free, 319760 used, 2645176 buff/cache
KiB Swap: 1048572 total, 1048572 free, 0 used. 15374388 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 2759544 73112 21612 S 100.6 0.4 8:04.90 typesense-serve
318 root 20 0 18240 3312 2840 S 0.0 0.0 0:00.02 bash
328 root 20 0 36644 3052 2624 R 0.0 0.0 0:00.00 top
pboros (07/29/2022, 7:36 AM):
Kishore Nallan (07/29/2022, 7:40 AM):
load_small?
pboros (07/29/2022, 7:50 AM):
Kishore Nallan (07/29/2022, 7:52 AM):
pboros (07/29/2022, 7:53 AM):
pboros (07/29/2022, 8:34 AM):
Kishore Nallan (07/29/2022, 10:43 AM):
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
07cb5d874739 dazzling_varahamihira 0.38% 141.9MiB / 3.841GiB 3.61% 516MB / 2.46MB 371MB / 27.9GB 231
pboros (07/29/2022, 10:45 AM):
Kishore Nallan (07/29/2022, 10:45 AM):
pboros (07/29/2022, 10:46 AM):
pboros (07/29/2022, 10:46 AM):
pboros (07/29/2022, 10:47 AM):
Kishore Nallan (07/29/2022, 10:47 AM):
pboros (07/29/2022, 10:49 AM):
pboros (07/29/2022, 10:50 AM):
Kishore Nallan (07/29/2022, 10:51 AM):
Kishore Nallan (07/29/2022, 10:52 AM):
Kishore Nallan (07/29/2022, 11:03 AM):
Kishore Nallan (07/29/2022, 11:05 AM):
Kishore Nallan (07/29/2022, 11:07 AM):
pboros (07/29/2022, 11:09 AM):
pboros (07/29/2022, 11:09 AM):
root@45652a6758b7:/# cat /proc/1/limits
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes unlimited unlimited processes
Max open files 1048576 1048576 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 63590 63590 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
Kishore Nallan (07/29/2022, 12:31 PM):
Can you run strace during the high CPU usage and get a capture of it? On Mac/Docker strace does not seem to work. Otherwise I have to launch a Linux instance to try later.
pboros (07/29/2022, 12:32 PM):
pboros (07/29/2022, 12:33 PM):
pboros (07/29/2022, 12:34 PM):
With top -H you can see what thread it is, in my case now it's 231:
231 root 20 0 2751372 79668 21612 R 99.9 0.5 2:21.33 typesense-serve
7 root 20 0 2751372 79668 21612 S 0.3 0.5 0:02.26 typesense-serve
pboros (07/29/2022, 12:36 PM):
pboros (07/29/2022, 12:36 PM):
Kishore Nallan (07/29/2022, 12:36 PM):
Kishore Nallan (07/29/2022, 12:37 PM):
pboros (07/29/2022, 12:40 PM):
Kishore Nallan (07/29/2022, 12:44 PM):
# Does not slow down
for i in `seq 2 100`; do curl "http://localhost:8108/collections/coll1/documents?action=upsert" -X POST -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d "{\"id\": \"$i\", \"field1\": \"foo\", \"field2\": \"bar\"}"; done
This slows down and pegs CPU:
client = get_client()
load_small(client, "coll1", 100)
pboros (07/29/2022, 12:47 PM):
Kishore Nallan (07/29/2022, 12:47 PM):
pboros (07/29/2022, 12:47 PM):
pboros (07/29/2022, 12:50 PM):
pboros (07/29/2022, 12:50 PM):
root@3942c8bd03f4:/# strace -p 232 2>&1 | tee -a strace.out
strace: Process 232 attached
futex(0x7fbc2c4ea0cc, FUTEX_WAIT_PRIVATE, 207, NULL
pboros (07/29/2022, 12:51 PM):
pboros (07/29/2022, 12:52 PM):
pboros (07/29/2022, 12:52 PM):
Kishore Nallan (07/29/2022, 12:56 PM):
pboros (07/29/2022, 12:57 PM):
pboros (07/29/2022, 12:58 PM):
Kishore Nallan (07/29/2022, 12:59 PM):
db directory.
Kishore Nallan (07/29/2022, 12:59 PM):
pboros (07/29/2022, 12:59 PM):
pboros (07/29/2022, 1:00 PM):
root@3942c8bd03f4:/tmp# du -sh db/
77M db/
pboros (07/29/2022, 1:00 PM):
Kishore Nallan (07/29/2022, 1:00 PM):
pboros (07/29/2022, 1:00 PM):
pboros (07/29/2022, 1:01 PM):
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.00 10.00 0.00 15.00 0.00 0.10 13.33 0.01 0.33 0.00 0.33 0.67 1.00
avg-cpu: %user %nice %system %iowait %steal %idle
11.65 0.00 1.00 0.13 0.00 87.22
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.00 11.88 0.00 11.88 0.00 0.09 16.00 0.01 0.33 0.00 0.33 0.75 0.89
Kishore Nallan (07/29/2022, 1:01 PM):
pboros (07/29/2022, 1:02 PM):
pboros (07/29/2022, 1:04 PM):
pboros (07/29/2022, 1:04 PM):
{
"delete_latency_ms": 0,
"delete_requests_per_second": 0,
"import_latency_ms": 0,
"import_requests_per_second": 0,
"latency_ms": {
"GET /health": 0,
"POST /collections/epnbiqxgrr/documents/": 470.34285714285716,
"POST /collections/imkwwholpv/documents/": 651
},
"pending_write_batches": 1,
"requests_per_second": {
"GET /health": 0.1,
"POST /collections/epnbiqxgrr/documents/": 3.5,
"POST /collections/imkwwholpv/documents/": 0.5
},
"search_latency_ms": 0,
"search_requests_per_second": 0,
"total_requests_per_second": 4.1,
"write_latency_ms": 492.925,
"write_requests_per_second": 4
}
pboros (07/29/2022, 1:04 PM):
Kishore Nallan (07/29/2022, 1:05 PM):
Kishore Nallan (07/29/2022, 1:06 PM):
I20220729 18:33:31.825273 92155904 batched_indexer.cpp:67] batch indexer queue notify
I20220729 18:33:33.282867 87326720 batched_indexer.cpp:137] batch indexer queue awake
Kishore Nallan (07/29/2022, 1:07 PM):
Kishore Nallan (07/29/2022, 1:07 PM):
pboros (07/29/2022, 1:08 PM):
pboros (07/29/2022, 1:08 PM):
Kishore Nallan (07/29/2022, 1:09 PM):
pboros (07/29/2022, 1:11 PM):
Kishore Nallan (07/29/2022, 1:13 PM):
Kishore Nallan (07/29/2022, 3:01 PM):
Most of the time is being spent in rocksdb::DBIter::Next() in the batched indexer, so I figured it is taking too long to do db iterations after dropping so many keys. I did a full db compaction after the deletion of the keys and that seems to have helped. I initially tried a range compaction on only the key range that was deleted, but that didn't help. Right now I'm doing a blind compaction as part of every collection drop, but I probably have to change it to happen only when a "large" collection is dropped. Not sure if there is another criterion (like some kind of key fragmentation) to use here.
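A minimal sketch of that kind of blind full-DB compaction with the RocksDB C++ API (illustrative only; the helper name is made up and this is not Typesense's actual code):

#include <rocksdb/db.h>

// Compact the entire key space so the tombstones left behind by a bulk delete
// get rewritten away. Passing nullptr for both bounds compacts everything.
void compact_whole_db(rocksdb::DB* db) {
    rocksdb::CompactRangeOptions copts;
    rocksdb::Status s = db->CompactRange(copts, /*begin=*/nullptr, /*end=*/nullptr);
    if (!s.ok()) {
        // log / surface the error as appropriate
    }
}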
Ovais Tariq (07/29/2022, 3:03 PM):
Kishore Nallan (07/29/2022, 3:06 PM):
pboros (07/29/2022, 3:06 PM):
pboros (07/29/2022, 3:07 PM):
Kishore Nallan (07/29/2022, 3:08 PM):
pboros (07/29/2022, 3:08 PM):
Kishore Nallan (07/29/2022, 3:08 PM):
> When deleting lots of rows, some of the SST files might be filled with tombstones and affected range scan performance. We extended RocksDB’s compaction to track long close-by tombstones. If a key range with high density of tombstones is detected, it immediately triggers another compaction in order to compact them away. This helped to reduce range scan performance skews caused by scanning excessed number of tombstones. In RocksDB, CompactOnDeletionCollectorFactory is the right class to do that.
This seems relevant.
Ovais Tariq (07/29/2022, 3:09 PM):
Kishore Nallan (07/29/2022, 3:09 PM):
pboros (07/29/2022, 3:09 PM):
Ovais Tariq (07/29/2022, 3:09 PM):
Himank Chaudhary (07/29/2022, 4:35 PM):
Kishore Nallan (07/29/2022, 4:58 PM):
Himank Chaudhary (07/29/2022, 4:59 PM):
Ovais Tariq (07/29/2022, 6:07 PM):
Ovais Tariq (07/29/2022, 6:07 PM):
Kishore Nallan (08/01/2022, 1:35 PM):
NewCompactOnDeletionCollectorFactory as described here seems to be the most generic way to account for this: https://github.com/facebook/rocksdb/blob/7fd68b7c39c76b5741533ee2a4d7024f3b513f2d/include/rocksdb/utilities/table_properties_collectors.h#L85
Ovais Tariq (08/02/2022, 1:44 PM):
Ovais Tariq (08/02/2022, 1:44 PM):
Kishore Nallan (08/03/2022, 2:58 AM):
Kishore Nallan (08/05/2022, 7:28 AM):
NewCompactOnDeletionCollectorFactory does not help. New writes post deletion still take just as long: I don't even see an improvement. It's not clear to me when this compaction will kick in. While I do see compaction-related logs while deletion happens, it's not clear if they are related to this and in any case it doesn't help. So far the only thing that helps is an outright full DB compaction right after the deletion. Even compacting the range of the deleted collection keys doesn't help.
Ovais Tariq (08/05/2022, 1:13 PM):
Ovais Tariq (08/05/2022, 1:14 PM):
Kishore Nallan (08/05/2022, 1:14 PM):
Kishore Nallan (08/05/2022, 1:14 PM):
Kishore Nallan (08/05/2022, 1:15 PM):
Kishore Nallan (08/05/2022, 1:15 PM):
Kishore Nallan (08/05/2022, 1:18 PM):
There is also read_options.iterate_upper_bound, which has been recommended. I will check that.
Ovais Tariq (08/05/2022, 1:20 PM):
Ovais Tariq (08/05/2022, 1:20 PM):
Kishore Nallan (08/05/2022, 1:21 PM):
Ovais Tariq (08/05/2022, 1:25 PM):
Kishore Nallan (08/05/2022, 1:26 PM):
options.table_properties_collector_factories.emplace_back(
    rocksdb::NewCompactOnDeletionCollectorFactory(10000, 7500, 0.5));
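For reference, per the RocksDB header linked above, the three arguments are the sliding window size, the deletion trigger and the deletion ratio; an annotated version of the same call:

rocksdb::Options options;
// Mark an SST file for compaction when, within any window of 10000 consecutive
// table entries, at least 7500 are deletion entries, or when deletions make up
// at least 50% of the entries in the whole file.
options.table_properties_collector_factories.emplace_back(
    rocksdb::NewCompactOnDeletionCollectorFactory(
        /*sliding_window_size=*/10000,
        /*deletion_trigger=*/7500,
        /*deletion_ratio=*/0.5));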
Kishore Nallan (08/05/2022, 1:28 PM):
Kishore Nallan (08/05/2022, 1:28 PM):
Ovais Tariq (08/05/2022, 1:28 PM):
Ovais Tariq (08/05/2022, 1:32 PM):
Kishore Nallan (08/06/2022, 10:11 AM):
options.level0_file_num_compaction_trigger = 1 (default: 4). Level 0 compaction did not trigger previously because there were not enough level-0 files to reach that default threshold. I think if there had been sustained writes after the deletion, it would have eventually happened. I still need to understand the implications of setting this value too low, but the impact of a large number of level-0 files is explained succinctly in this paragraph:
> RocksDB level 0 SST files are different from other levels. Memtables are flushed to disk as level 0 SST files. The data in a level 0 SST file are arranged in the order of generation as the memtable, and data from the same regions can be spread across different files. Also, data with the same key can be in multiple SST files. Therefore, when a read occurs, each level 0 SST file must be read in order. The more level 0 SST files, the more likely query read performance will be impacted negatively.
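A sketch of the option change being discussed, with the values mentioned above (where Typesense actually sets its DB options is not shown in this thread):

rocksdb::Options options;
// Default is 4: L0 -> L1 compaction only starts once four level-0 files exist.
// With only a handful of memtable flushes after the bulk delete, that threshold
// was never reached, so the tombstone-heavy L0 files lingered. A trigger of 1
// makes compaction kick in after every flush.
options.level0_file_num_compaction_trigger = 1;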
Ovais Tariq (08/06/2022, 1:02 PM):
Ovais Tariq (08/06/2022, 1:03 PM):
Kishore Nallan (08/06/2022, 1:10 PM):
20M 000906.sst
21M 000909.sst
10M 001039.sst
9.9M 001040.sst
4.0K 001105.sst
L0: 0
L1: 1 (2 KB)
L2: 4 (60 MB)
Ovais Tariq (08/06/2022, 1:12 PM):
Ovais Tariq (08/06/2022, 1:12 PM):
Kishore Nallan (08/06/2022, 1:14 PM):
Those levels are from the LOG file, which matched the file sizes on disk.
Ovais Tariq (08/06/2022, 1:14 PM):
Ovais Tariq (08/06/2022, 1:14 PM):
Kishore Nallan (08/06/2022, 1:15 PM):
4.0K 001136.sst
16K 001138.sst
Nothing logged in the LOG file on this run. I probably have to use some rocksdb CLI tool or something to get what levels these are on.
Ovais Tariq (08/06/2022, 1:16 PM):
Kishore Nallan (08/06/2022, 1:17 PM):
max_bytes_for_level_base is at its default of 256M, so I am not able to understand how anything would ever end up in L2, because the total size is ~60M even without compaction.
Kishore Nallan (08/06/2022, 1:20 PM):
Ovais Tariq (08/06/2022, 1:23 PM):
Ovais Tariq (08/06/2022, 1:49 PM):
Ovais Tariq (08/06/2022, 1:50 PM):
Kishore Nallan (08/06/2022, 1:53 PM):
It's the .next() call which is taking time, not the initial iterator seek call. This is because the next call is not bounded, and according to another comment on the RocksDB GH, setting an upper bound on the iterator will help. The deleted key range sits after the seek's prefix, so without an upper bound the iterator may keep looking at all those tombstones that could be skipped (there's an open but abandoned PR to handle this scenario).
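A rough sketch of the bounded-iterator idea, assuming a collection's keys share a common prefix (the prefix scheme and helper function are hypothetical, not Typesense's actual code):

#include <memory>
#include <string>
#include <rocksdb/db.h>

// Scan only the keys starting with `prefix`, capping the iterator with
// iterate_upper_bound so Next() never wanders into the tombstones of a dropped
// collection whose keys sort right after this prefix.
void scan_prefix(rocksdb::DB* db, const std::string& prefix) {
    std::string upper = prefix;
    upper.back() = static_cast<char>(upper.back() + 1);  // toy successor; assumes last byte < 0xFF
    rocksdb::Slice upper_bound(upper);

    rocksdb::ReadOptions ropts;
    ropts.iterate_upper_bound = &upper_bound;  // hard stop for Seek()/Next()

    std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(ropts));
    for (it->Seek(prefix); it->Valid(); it->Next()) {
        // process it->key() / it->value()
    }
}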
Ovais Tariq (08/06/2022, 1:58 PM):
Ovais Tariq (08/06/2022, 1:59 PM):
Ovais Tariq (08/06/2022, 2:02 PM):
Ovais Tariq (08/06/2022, 2:02 PM):
Kishore Nallan (08/06/2022, 2:02 PM):
Changing level_compaction_dynamic_level_bytes is tricky on existing DBs, so need to plan how to do that.
Kishore Nallan (08/06/2022, 2:03 PM):
Ovais Tariq (08/06/2022, 2:08 PM):
Ovais Tariq (08/06/2022, 2:18 PM):
Ovais Tariq (08/06/2022, 2:18 PM):
Kishore Nallan (08/06/2022, 2:28 PM):
Ovais Tariq (08/06/2022, 2:30 PM):
Kishore Nallan (08/08/2022, 1:29 PM):
Ovais Tariq (08/08/2022, 6:04 PM):
pboros (08/09/2022, 6:43 AM):
Kishore Nallan (08/09/2022, 6:47 AM):
Kishore Nallan (08/09/2022, 6:49 AM):
Kishore Nallan (08/09/2022, 1:29 PM):
There's a typesense/typesense:0.24.0.rc28 Docker image that contains the iterator upper bound patch, which has eliminated the post-bulk-delete write latencies for me locally.
Kishore Nallan (08/10/2022, 4:50 AM):
One more note on NewCompactOnDeletionCollectorFactory -- it only works for single deletes and not for range deletes. If a range delete is used, the factory becomes moot. I've already switched the collection deletion to a range delete as well, so I think we should be good now.
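For illustration, replacing the per-key delete loop pasted earlier with a single range delete over a hypothetical collection key prefix could look roughly like this:

#include <string>
#include <rocksdb/db.h>

// Drop every key starting with `coll_prefix` using one range tombstone instead
// of issuing a Delete() per key.
rocksdb::Status drop_prefix(rocksdb::DB* db, const std::string& coll_prefix) {
    std::string end = coll_prefix;
    end.back() = static_cast<char>(end.back() + 1);  // toy successor; assumes last byte < 0xFF
    return db->DeleteRange(rocksdb::WriteOptions(), db->DefaultColumnFamily(),
                           coll_prefix, end);
}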
Ovais Tariq (08/10/2022, 3:26 PM):
pboros (08/10/2022, 3:30 PM):
Kishore Nallan (08/10/2022, 3:38 PM):
Ovais Tariq (08/10/2022, 4:06 PM):