Error in restore node 2
# community-help
n
Error in restore node 2
Adding the log
@Jason Bosco @Kishore Nallan
It stopped, and it stops again the moment it tries to restore
j
Uh oh, that looks like a bug in how we handle potentially badly formatted data... Could you open a GitHub issue for this with the full log you posted above, so we can track this?
n
Ok
error in the Python client
j
Thank you, will take a closer look
n
Nodes 1 and 3 are running. I created a snapshot on node 3, deleted the data on node 2, copied the snapshot to node 2, and started it. It produces the same error while the data is indexing
I also created a snapshot on node 1, deleted the data on node 2, copied that snapshot to node 2, and started it. Again it produces the same error while the data is indexing
message has been deleted
j
In general, you do not want to copy data like that between nodes manually, since each node stores its own state information that's tied to its IP address internally
So copying the data directory between nodes in the same cluster will cause the cluster state to get corrupted
If you need to reset a node, you want to stop the Typesense process on that node, delete the data directory and then restart the Typesense process on that node. That node will then reach out to the other nodes in the cluster, get a snapshot via its own internal mechanism and recover by itself
n
delete only the db directory?
j
No, you'd need to delete all the data in the typesense dir
n
Which files or directories should be deleted? Or is there an endpoint we should use to do the deletion?
j
You'd have to essentially do
rm -rf /data/typesense/*
in your case
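Putting that exchange together, a minimal sketch of the reset procedure Jason describes, assuming the data directory from this thread (/data/typesense) and a systemd-managed install; the service name is illustrative:
# on the node being reset, stop the Typesense process first
sudo systemctl stop typesense-server
# wipe that node's data directory (path as used in this thread)
rm -rf /data/typesense/*
# start Typesense again; the node contacts the other cluster members,
# fetches a snapshot through the built-in mechanism, and recovers on its own
sudo systemctl start typesense-server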
n
does this endpoint create a snapshot of only one node?
j
Correct
If you use a snapshot generated from that API endpoint, you'd have to create a standalone 1-node cluster first with that snapshot, and then add nodes 2 and 3 after Node 1 is fully up
n
To restore the same node? Or does it also serve for the other cluster nodes?
mmmm ok
So the snapshot, despite coming from only one node, can be used to bring up the entire cluster. In theory, the first node of the cluster is configured with that snapshot and it then spreads the data to the other nodes?
j
Yup, exactly
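A rough sketch of that restore flow, assuming the default API port 8108 and illustrative paths; the snapshot endpoint and server flags are from Typesense's API, but the exact copy step used to seed the new node's data directory is an assumption:
# 1. create a snapshot on the source node; snapshot_path is written on that node's disk
curl -X POST "http://localhost:8108/operations/snapshot?snapshot_path=/tmp/typesense-snapshot" -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"
# 2. seed an empty data directory on the new first node with the snapshot contents (assumed copy step)
rm -rf /data/typesense/*
cp -r /tmp/typesense-snapshot/* /data/typesense/
# 3. bring that node up as a standalone 1-node cluster; only after it is fully up,
#    add nodes 2 and 3 to the nodes file so they replicate from it
typesense-server --data-dir=/data/typesense --api-key=${TYPESENSE_API_KEY} --nodes=/etc/typesense/nodes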
n
mmm ok
Let me explain what I did that worked. First, I created the snapshot with the endpoint. Then I removed all the data from the data folder, copied the state folder that is inside the snapshot folder, ran the container, and it configured itself.
And the data is the same as on the other nodes
@Jason Bosco that speeds up the time to bring the node up, compared to letting it replicate the data from scratch
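A sketch of the sequence Nelson describes, assuming a Docker-based node; the container name and paths are illustrative, and the state-folder copy is exactly what he reports rather than a documented restore path:
# 1. create the snapshot via the API endpoint (written to snapshot_path on the node's disk)
curl -X POST "http://localhost:8108/operations/snapshot?snapshot_path=/snapshots/node-snapshot" -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"
# 2. remove everything from the data folder, then copy the state folder from inside the snapshot back in
rm -rf /data/typesense/*
cp -r /snapshots/node-snapshot/state /data/typesense/state
# 3. start the container again; per Nelson, the node then configures itself
#    and ends up with the same data as the other nodes
docker start typesense-node-2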
j
I see, I haven't tried to do this myself, but good to know
👍 1
🙌 1
I usually just let new nodes catch up from the other cluster nodes
1
😃 1
👍 1
n
Because that takes a long time, and with faults of this kind I can't take down the whole cluster and then bring it back up
k
@Nelson Moncada is this still reproducible?
n
I deleted the Docker container, copied the snapshot into the data directory, and ran a new container. That resolved the problem.
k
Okay, if this happens again, I will be happy to look further. The proper way to start a new node is to start it with an empty data directory. The node will be able to pull everything from the current leader (one of the other 2 nodes will be the leader). You don't have to do any manual snapshot and data transfer.
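For completeness, a couple of read-only checks that can confirm a node restarted with an empty data directory has rejoined and caught up; the host name here is illustrative:
# /health returns {"ok": true} once the node is serving requests
curl "http://node-2:8108/health"
# /debug reports the node's version and Raft state (1 = leader, 4 = follower)
curl "http://node-2:8108/debug" -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"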