How can I know that indexing completed on a dataset typesense #community-help

# community-help

Thomas

03/02/2022, 11:44 AM

How can I know that indexing completed on a dataset?

Kishore Nallan

03/02/2022, 11:45 AM

When the endpoint response arrives, indexing is done. All endpoints are synchronous.
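A minimal sketch of what "synchronous" means in practice, assuming the bulk import endpoint (`POST /collections/{name}/documents/import`) and its one-result-per-line JSONL response. The `base_url`, `api_key`, and collection name are placeholders, and `summarize_import` / `import_documents` are hypothetical helper names, not part of any Typesense client:

```python
import json
import urllib.request


def summarize_import(response_body):
    """Parse a JSONL import response: one JSON result per submitted document."""
    results = [json.loads(line) for line in response_body.splitlines() if line.strip()]
    failures = [r for r in results if not r.get("success")]
    return len(results) - len(failures), failures


def import_documents(base_url, api_key, collection, docs):
    """Send docs as JSONL and block until the server responds.

    Assumption: because the endpoint is synchronous, the documents that
    report success are already indexed by the time this function returns.
    """
    body = "\n".join(json.dumps(d) for d in docs).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/collections/{collection}/documents/import?action=create",
        data=body,
        method="POST",
        headers={"X-TYPESENSE-API-KEY": api_key, "Content-Type": "text/plain"},
    )
    with urllib.request.urlopen(req) as resp:
        return summarize_import(resp.read().decode("utf-8"))
```

Calling `import_documents("http://localhost:8108", "xyz", "products", docs)` would return a count of indexed documents plus the list of per-document failures, so no separate "is indexing done?" check is needed after the call returns.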

Thomas

03/02/2022, 11:46 AM

If we restart the instance, it takes a long time before it's usable, like 30 minutes. Does it re-index on reboot?

Kishore Nallan

03/02/2022, 11:47 AM

Yes, only raw documents are stored on disk and indexing happens in memory on restart.
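Since the index is rebuilt in memory on restart, a client may want to wait before sending traffic. A hedged sketch, assuming the node's `GET /health` endpoint does not report `{"ok": true}` until it is ready to serve; `is_healthy` and `wait_until_healthy` are hypothetical helper names and the timeout values are arbitrary:

```python
import json
import time
import urllib.error
import urllib.request


def is_healthy(base_url):
    """Return True only if GET /health answers with {"ok": true}."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return json.loads(resp.read().decode("utf-8")).get("ok") is True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, HTTP error status, or unparsable body: not ready.
        return False


def wait_until_healthy(check, timeout_s=3600.0, interval_s=5.0, sleep=time.sleep):
    """Poll check() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

For example, `wait_until_healthy(lambda: is_healthy("http://localhost:8108"))` would block for up to an hour, which comfortably covers the 30-minute reload mentioned above.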

Thomas

03/02/2022, 11:48 AM

Alright, what's the bottleneck on that? CPU or Disk speed?

Kishore Nallan

03/02/2022, 11:48 AM

CPU. The latest 0.23 RC builds are faster in this respect.

Thomas

03/02/2022, 11:51 AM

How much faster is 0.23 RC?

Kishore Nallan

03/02/2022, 11:51 AM

Depends on the dataset. The primary work has been around numerical fields.

Thomas

03/02/2022, 11:52 AM

The dataset is mostly text.

Kishore Nallan

03/02/2022, 11:52 AM

We recommend running a 3-node configuration so rotations can be done without a single point of failure.

Kishore Nallan

03/02/2022, 11:53 AM

There might be a few other things we can still do to optimize text fields.

Thomas

03/02/2022, 11:59 AM

Yeah we're starting with 3 nodes in a cluster

Thomas

03/02/2022, 11:59 AM

What's the optimal size in terms of keys?

Thomas

03/02/2022, 12:00 PM

We have 60 datapoints that we need to filter on.

Thomas

03/02/2022, 12:11 PM

@Kishore Nallan How often does the API break? Could rolling upgrades on a cluster be a problem?

Kishore Nallan

03/02/2022, 12:11 PM

Every node stores all the data, so additional nodes help increase throughput across many users.

Kishore Nallan

03/02/2022, 12:12 PM

We've successfully rolled out 5 version upgrades so far on Typesense Cloud across hundreds of deployments. We take care to maintain backward compatibility.

Kishore Nallan

03/02/2022, 12:14 PM

We store nothing but documents on disk, so upgrades are rarely a problem.

Thomas

03/02/2022, 12:18 PM

Superb 🙂

Thomas

03/02/2022, 12:18 PM

Thank you

Kishore Nallan

03/02/2022, 12:18 PM

👍

Thomas

03/02/2022, 12:47 PM

@Kishore Nallan Is it not possible to periodically dump the in-RAM index to disk, so it doesn't need to be re-indexed on reboot?

💡 2

Kishore Nallan

03/02/2022, 2:37 PM

You could try doing this via CRIU: https://criu.org/Main_Page

👍 1

Thomas

03/02/2022, 2:48 PM

Yeah, that's how we currently do it with KVM, but this doesn't help if there's a hardware issue.
