# community-help
e
Hi all. When updating a document, how fast will this change take effect?
j
When the API call returns, the update has already been processed and will show up in search
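For example, with curl (host, API key, collection and field names below are placeholders, not from this thread):

```bash
# Partially update document "123"; the call blocks until the write has been applied
curl -s "https://typesense.example.com/collections/products/documents/123" \
  -X PATCH \
  -H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "updated name"}'

# A search issued right after the call above returns should already see the new value
curl -s "https://typesense.example.com/collections/products/documents/search?q=updated&query_by=name" \
  -H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY"
```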
e
Even if I'm using firebase extension?
I did a query and the value didn't change
j
Oh the firebase extension delay - that I'm not sure. It depends on how fast Firestore calls the extension code after a document change
I was talking about the time between the Typesense API call being made and when it is processed
e
even though I checked in the dashboard that the document was changed
But when I perform a query using shell
the value is the old one.
j
I've not heard of any delays here though, unless it errored out. Could you check the Firebase function logs for the extension and see if any errors show up?
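If it helps, one way to pull those logs is with the Firebase CLI (the extension's function name depends on your instance ID, so the --only value below is only a placeholder):

```bash
# Tail recent logs for all functions in the project
firebase functions:log

# Or narrow it down to the Typesense extension's function; check
# `firebase functions:list` (or the Firebase console) for the real name
firebase functions:log --only ext-firestore-typesense-search-indexOnWrite
```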
e
The logs only show that the document was upserted with the right values.
j
Hmm, if no errors show up there, then it should have made its way to Typesense. Could you make sure you're looking at the right collection, etc?
Also, could you refresh the Typesense Cloud dashboard and try just in case
e
Interesting. When I modify the document the first time
The value changes in Firebase and I can see that the extension is working as it should
But the change doesn't take effect
When I change the document a second time, everything goes as usual but this time I can see the change in typesense too.
In short, I have to write the value 2 times before I see the change in typesense.
j
I suspect it's the search cache in Typesense at play then. Could you try hitting the Typesense API directly via curl, do a GET on the document with its ID directly to confirm this?
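Something along these lines (placeholder host, collection and document ID):

```bash
# Fetch the document by ID straight from the Typesense API, bypassing search
curl -s "https://typesense.example.com/collections/products/documents/123" \
  -H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY"
```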
e
I changed the value 5 times.
First time • Changed the value in Firebase, Typesense-Ext executed, Typesense document result (using curl) = old parameter.
Second time • Same as first time.
Third time • Waited before changing the document, performed the same steps, Typesense document result (using curl) = new parameter.
j
Hmmm, hang on. Let me try to replicate this...
e
Ok
j
Hmmm, I can't seem to replicate it
Could you share your cluster ID?
e
We're running on our own test servers, so we don't have a cluster ID yet.
j
I can't think of a reason why it would be flaky though... It should either work or not work fully. Unless there's a network connection issue between firestore and your server
Could you double check that there's enough RAM as well on the server?
e
We're running typesense on 3 nodes with 1GB RAM + 32GB storage.
j
I also tested on a 3 node cluster. Could you do a GET /debug on all three nodes and post the output?
I wonder if one of the nodes is not part of the cluster
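For example (hostnames are placeholders for your three nodes):

```bash
# Hit the /debug endpoint on each node and compare the "state" values
for host in test1.example.com test2.example.com test3.example.com; do
  echo "== $host =="
  curl -s "https://$host/debug" -H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY"
  echo
done
```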
e
Server 1 Debug output:
{
  "state": 1,
  "version": "0.22.1"
}
Server 2 Debug output:
{
  "state": 1,
  "version": "0.22.1"
}
Server 3 Debug output:
{
  "state": 1,
  "version": "0.22.1"
}
j
Yup, that is indeed the issue. If a cluster had been successfully established between all 3 nodes, you'd see state: 1 on one node (the leader) and state: 4 on the other two nodes (the followers)
e
Oh.
Ok.
How would I fix this?
j
Could you share your Typesense configs and the nodes file contents from all 3 nodes?
e
Yes
Server 1
; Typesense Configuration

[server]

api-address = company-ip-1
api-port = 443
data-dir = /var/lib/typesense
log-dir = /var/log/typesense
api-key = typesense-api-key
peering-address = 192.168.199.212
peering-port = 8107
nodes = /etc/typesense/nodes
ssl-certificate = /etc/letsencrypt/live/test1.example.com/fullchain.pem
ssl-certificate-key = /etc/letsencrypt/live/test1.example.com/privkey.pem
Server 2
; Typesense Configuration

[server]

api-address = company-ip-2
api-port = 443
data-dir = /var/lib/typesense
log-dir = /var/log/typesense
api-key = typesense-api-key
peering-address = 192.168.199.3
peering-port = 8107
nodes = /etc/typesense/nodes
ssl-certificate = /etc/letsencrypt/live/test2.example.com/fullchain.pem
ssl-certificate-key = /etc/letsencrypt/live/test2.example.com/privkey.pem
Server 3
; Typesense Configuration

[server]

api-address = company-ip-3
api-port = 443
data-dir = /var/lib/typesense
log-dir = /var/log/typesense
api-key = typesense-api-key
peering-address = 192.168.199.25
peering-port = 8107
nodes = /etc/typesense/nodes
ssl-certificate = /etc/letsencrypt/live/test3.example.com/fullchain.pem
ssl-certificate-key = /etc/letsencrypt/live/test3.example.com/privkey.pem
/etc/typesense/nodes
192.168.199.212:8107:443,192.168.199.3:8107:443,192.168.199.25:8107:443
j
The configs seem fine... Could you use, say, telnet to check that port 8107 on each node is accessible from the other nodes?
Eg:
telnet 192.168.199.3 8107
should show you a prompt when you run it from the other two hosts
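Or, if netcat is installed, a quick check like this from each node (peering IPs taken from the configs above):

```bash
# From each node, confirm the other nodes' peering port 8107 is reachable
for peer in 192.168.199.212 192.168.199.3 192.168.199.25; do
  nc -zv -w 3 "$peer" 8107
done
```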
e
The servers can see each other at layer 3,
but telnet to port 8107 fails.
j
Yeah, something's up there then. Could you check firewall rules to make sure that port 8107 is allowed on all the nodes at least for the 192.168.x.x subnet
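For example, if the nodes use ufw (adjust for iptables, firewalld or cloud security groups, and for your actual subnet):

```bash
# Allow the Typesense peering port from the private subnet, then verify
sudo ufw allow from 192.168.199.0/24 to any port 8107 proto tcp
sudo ufw status numbered
```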
e
Yes
You were right. We configured the firewall to allow port 8107
Now all servers can telnet to port 8107
j
Great, if you now restart the Typesense processes on all the servers, they should start forming a cluster
You want to double-check by hitting the /debug endpoint
e
so, reboot all servers?
j
sudo systemctl restart typesense-server.service
should be sufficient ^
e
Nice. Now we have 1, 4, 4
j
🙌
🙌🏾 1
e
What happens if the servers already had data?
j
They would not have been able to reconcile with each other, since they most likely would have had different data; it sounded like writes were going to different nodes
So you would have had to delete the data dir on two of the nodes and then start the cluster, so that the third node could sync the data to the other two nodes
e
So, shut down all nodes, erase the data dir on 2 of the nodes, then restart all nodes again?
Do we need to delete the data dir itself or just its contents?
j
The content inside the data dir.
Could you first check the logs though
If it says "Peer Refresh Succeeded" on the node with state 1, then you're good. You don't have to delete the data dir
e
Where can I check these logs? Using systemctl? /debug only shows version and state.
j
/var/log/typesense/typesense.log
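For example:

```bash
# Watch the Typesense log for the peer refresh messages mentioned above
sudo tail -f /var/log/typesense/typesense.log | grep -i "peer refresh"
```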
e
It seems it failed. We're getting the following:
I20211215 05:19:19.205374   771 node.cpp:722] node default_group:192.168.199.3:8107:443 waits peer 192.168.199.212:8107:443 to catch up
I20211215 05:19:19.205453   771 node.cpp:722] node default_group:192.168.199.3:8107:443 waits peer 192.168.199.25:8107:443 to catch up
W20211215 05:19:19.206537   774 replicator.cpp:392] Group default_group fail to issue RPC to 192.168.199.212:8107:443 _consecutive_error_times=11, [E112]Not connected to 192.168.199.212:8107 yet, server_id=206158430322 [R1][E112]Not connected to 192.168.199.212:8107 yet, server_id=206158430322 [R2][E112]Not connected to 192.168.199.212:8107 yet, server_id=206158430322 [R3][E112]Not connected to 192.168.199.212:8107 yet, server_id=206158430322


W20211215 05:19:19.206666   774 replicator.cpp:292] Group default_group fail to issue RPC to 192.168.199.25:8107:443 _consecutive_error_times=11, [E112]Not connected to 192.168.199.25:8107 yet, server_id=163208757684 [R1][E112]Not connected to 192.168.199.25:8107 yet, server_id=163208757684 [R2][E112]Not connected to 192.168.199.25:8107 yet, server_id=163208757684 [R3][E112]Not connected to 192.168.199.25:8107 yet, server_id=163208757684
W20211215 05:19:20.806144   771 socket.cpp:1193] Fail to wait EPOLLOUT of fd=28: Connection timed out [110]
W20211215 05:19:20.806241   771 socket.cpp:1193] Fail to wait EPOLLOUT of fd=26: Connection timed out [110]
j
Hmm, "Connection timed out" is a different issue - sounds like one of the nodes might still have trouble connecting
In any case, I think it's good to clear the data dir from two of the nodes, and then restart them so they can catch up with the 3rd node
e
Ok. Let me check.
I'm getting the same result plus this:
Peer refresh failed, error: Peer 192.168.199.212:8107:443 failed to catch up
We've erased all folders from the data dir on each node.
All 3 nodes are showing the same logs.
We're also seeing these at the beginning:
Running GC for aborted requests, req map size: 0
I20211214 17:47:19.990471   673 raft_server.cpp:524] Term: 5, last_index index: 5, committed_index: 5, known_applied_index: 5, applying_index: 0, queued_writes: 0, pending_queue_size: 0, local_sequence: 1
j
Ok, let's try this. Could you stop all three processes, clear the data dir on all nodes, then on one of the nodes edit the nodes file to contain just its own IP and start the Typesense process up. It should log "peer refresh succeeded". Then add the 2nd node's IP to the nodes file and start the Typesense process on the 2nd node, and do the same for the 3rd node
The last log lines you shared are normal
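Spelled out as shell commands, that sequence would look roughly like this (IPs taken from the nodes file you shared; adjust paths/ports if anything differs on your side):

```bash
# 0. On all three nodes: stop Typesense and clear the data dir contents
sudo systemctl stop typesense-server.service
sudo rm -rf /var/lib/typesense/*

# 1. On node 1 only: nodes file with just its own entry, then start it
echo "192.168.199.212:8107:443" | sudo tee /etc/typesense/nodes
sudo systemctl start typesense-server.service
# wait until /var/log/typesense/typesense.log shows "Peer refresh succeeded"

# 2. Update the nodes file on nodes 1 and 2 to list both entries, then start node 2
echo "192.168.199.212:8107:443,192.168.199.3:8107:443" | sudo tee /etc/typesense/nodes
sudo systemctl start typesense-server.service    # on node 2

# 3. Same again with all three entries in the nodes file, then start node 3
echo "192.168.199.212:8107:443,192.168.199.3:8107:443,192.168.199.25:8107:443" | sudo tee /etc/typesense/nodes
sudo systemctl start typesense-server.service    # on node 3
```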
e
Ok. Let me try it.
Should we edit /etc/typesense/nodes too?
j
Yes, by “nodes file” I meant edit /etc/typesense/nodes
e
Running systemctl status, we get the following:
Started Typesense Server.
Log directory is configured as: /var/log/typesense
Peer refresh failed, error: Doing another configuration change
j
You want to look at the logs in /var/log/typesense
It should eventually log peer refresh succeeded
e
It's interesting. The log shows the server trying to connect to the other 2 even though they are off and /etc/typesense/nodes only has its own IP with the ports (8107:443).
j
That means the data dir from the previous run is still intact
You want to stop the Typesense process, clear the data dir, make sure it’s fully empty and start the Typesense process again
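i.e. something along these lines (data dir path from the configs above):

```bash
sudo systemctl stop typesense-server.service
sudo rm -rf /var/lib/typesense/*
ls -la /var/lib/typesense/    # should now be empty
sudo systemctl start typesense-server.service
```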
e
That's weird. We stopped the service using systemctl stop typesense-server.service, then erased the db, meta and state folders from /var/lib/typesense, and then restarted the service.
Do you think we need to wait some time before restarting the service? It seems like the service is using a cached config when it restarts.
k
Can you try stopping all 3 nodes, deleting the contents of the data directory (rm -rf /var/lib/typesense/*) and starting them one by one again?
e
We'll do it. Let us check.
It's still trying to connect to the other servers. We're thinking of doing a fresh installation and going from there.
k
Okay, there is no other place where Typesense stores the config. So if it is still using some old configuration, it means that somehow the data directory is not being cleared correctly.
e
Thanks. We'll start over the testing and put the results in this thread.
Thanks for all the help @Jason Bosco.
👍 1