Running ML models on CPU is computationally intensive and so writes will be slow.
You would either have to slow down the pace of writes when you see a 503 (which is Typesense’s back-pressure mechanism to prevent CPU saturation) or add a GPU to speed up embedding generation.
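
For the first option, one way to slow down is to retry with exponential backoff whenever a write comes back with a 503. Here's a rough sketch of that idea, not an official Typesense client feature. `write_with_backoff`, `send`, and the default delays are all hypothetical names and values chosen for illustration; `send` stands in for whatever function performs your write and returns the HTTP status code.

```python
import random
import time
from typing import Callable


def write_with_backoff(
    send: Callable[[], int],
    max_retries: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() and retry while it returns 503, waiting longer each time.

    `send` is a hypothetical callable that performs one write and returns
    the HTTP status code. Returns the last status code seen.
    """
    status = send()
    for attempt in range(max_retries):
        if status != 503:
            break  # Write was accepted, or failed for a non-back-pressure reason.
        # Double the wait on every retry, capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Add up to 50% random jitter so parallel writers don't retry in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
        status = send()
    return status
```

You'd wrap each write (or each import batch) in `send` and tune `base_delay` and `max_retries` to your indexing volume. If the function still returns 503 after all retries, the server is staying saturated, and lowering write concurrency or adding a GPU is probably the better fix.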