Embedding generation is a compute-intensive process, and running it often on a two-core machine is not going to be feasible if you also have a lot of concurrent reads and writes. Even if we manage the read queue, we would end up having to slow indexing down several times over to keep read times acceptable. GPU instances will be much faster, so you could try that, but they do cost more, which changes the unit economics.
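
To make the trade-off concrete, here is a minimal sketch of what "slowing down indexing to protect reads" could look like. It is only an illustration, not how any particular system does it: `indexing_delay`, `index_with_backoff`, the `pending_reads` callback, and the delay values are all hypothetical names and numbers chosen for the example.

```python
import time
from typing import Callable, Iterable, Sequence


def indexing_delay(pending_reads: int,
                   per_read_delay: float = 0.25,
                   max_delay: float = 2.0) -> float:
    """Seconds to pause before the next embedding batch.

    The pause grows with the number of reads waiting and is capped
    at max_delay so indexing never stalls completely.
    """
    if pending_reads <= 0:
        return 0.0
    return min(max_delay, pending_reads * per_read_delay)


def index_with_backoff(batches: Iterable[Sequence[str]],
                       embed_batch: Callable[[Sequence[str]], None],
                       pending_reads: Callable[[], int],
                       sleep: Callable[[float], None] = time.sleep) -> float:
    """Embed each batch, yielding CPU time to reads when they queue up.

    Returns the total time spent waiting, which is the indexing
    slowdown paid to keep read latency acceptable.
    """
    total_wait = 0.0
    for batch in batches:
        delay = indexing_delay(pending_reads())
        if delay > 0:
            sleep(delay)
            total_wait += delay
        embed_batch(batch)  # hypothetical: calls the embedding model
    return total_wait
```

With only two cores, every second spent in `sleep` is a second the index falls further behind, which is why the pauses add up to a large slowdown when reads are constant.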