#community-help

Resolving "Not Ready or Lagging" Error and Improving Upsert Performance

TLDR Anton experienced a "Not Ready or Lagging" error when deleting collections and upserting new data. Kishore Nallan identified it was caused by the server lagging behind in writes, and suggested increasing the write lag threshold configuration. Kishore Nallan also mentioned a future build that could improve upsert performance.

Jun 16, 2021 (32 months ago)
Anton
12:39 PM
hi i get this error
{ "message": "Not Ready or Lagging" }
when i try to delete my collections. Do you know what may cause this issue?
Kishore Nallan
12:41 PM
This indicates that the node is lagging behind in writes.
Kishore Nallan
12:41 PM
Check the --healthy-read-lag and --healthy-write-lag parameters here: https://typesense.org/docs/0.20.0/guide/configure-typesense.html#using-command-line-arguments
Anton
12:41 PM
also, sometimes i get this message during import:
Anton
12:41 PM
Post "http://116.203.26.214:8108/collections": read tcp 192.168.1.50:53795->116.203.26.214:8108: wsarecv: An existing connection was forcibly closed by the remote host.
Anton
12:43 PM
ok Kishore Nallan, i'll check, thanks for the answer
Kishore Nallan
12:49 PM
I'm not sure about the other "forcibly closed" error. It indicates that the server closed the connection with a RST packet, but it can also mean that some form of client or TCP timeout caused the connection to drop. Finally (and least likely), check if the server itself crashed.
Anton
12:51 PM
Kishore Nallan server is alive, but still responds with {
    "message": "Not Ready or Lagging"
}
for several minutes already
Anton
12:52 PM
i tried restarting the service
Kishore Nallan
12:52 PM
Is this on Typesense cloud or self hosted?
Anton
12:52 PM
self hosted
Kishore Nallan
12:52 PM
What do the logs say?
Anton
12:52 PM
it just woke up
Kishore Nallan
12:52 PM
I think it was probably restarting and was catching up.
Anton
12:56 PM
i'll try again and see if it repeats
Jun 21, 2021 (31 months ago)
Anton
03:47 PM
hello Kishore Nallan. I was able to reproduce the issue (Not Ready or Lagging) several times.
Anton
03:47 PM
Right now the server doesn't allow me to upsert a new batch of data
Anton
03:48 PM
i saved several .log files from the sessions where this problem occurred
Anton
03:48 PM
could you please help me review the logs to identify the problem?
Kishore Nallan
03:49 PM
Okay, the server will be logging offsets. If you can send a few lines of that, we can tell how big the lag is. But the problem is essentially that the server is lagging behind while ingesting new data.

I presume this is happening during updates, correct?
Anton
03:52 PM
during upsert
Anton
03:53 PM
this is the full log
Anton
03:53 PM
Kishore Nallan could you pls let me know if you know the reason?
Kishore Nallan
04:19 PM
The writes are lagging behind. That is, you are pushing upserts too fast, and the resulting lag exceeds the configured write lag threshold of 1000.

See this snippet in the log line:

last_index index: 29278 ... applying_index: 27909
Kishore Nallan
04:19 PM
29278 - 27909 = 1369 entries.
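The subtraction above can be automated when scanning a saved log file. The following is a minimal Python sketch, assuming only the two fields quoted in this thread (`last_index index:` and `applying_index:`); the rest of the log line is elided in the original, so the pattern skips whatever sits between them. The function name `write_lag` is hypothetical.

```python
import re

# Matches the two counters quoted in the thread; anything between them is skipped.
LAG_PATTERN = re.compile(r"last_index index: (\d+).*?applying_index: (\d+)")


def write_lag(log_line):
    """Return last_index minus applying_index, or None if the line has no offsets."""
    match = LAG_PATTERN.search(log_line)
    if match is None:
        return None
    last_index, applying_index = (int(group) for group in match.groups())
    return last_index - applying_index


line = "last_index index: 29278 ... applying_index: 27909"
threshold = 1000  # the write lag threshold mentioned in this thread
lag = write_lag(line)
print(lag, lag > threshold)  # prints: 1369 True
```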
Anton
04:21 PM
hmm i see... Is it safe/ok to increase this threshold parameter?
Kishore Nallan
04:21 PM
To solve this, increase the --healthy-read-lag and --healthy-write-lag values to higher values. This way the server won't reject writes.
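As a rough illustration of the suggestion above, this Python sketch builds a server command line with both thresholds set explicitly. The flag names come from this thread and the linked configuration page; the data directory, API key, and the value 5000 are placeholder assumptions, not recommendations, and the helper name `typesense_command` is hypothetical.

```python
import shlex


def typesense_command(data_dir, api_key, read_lag, write_lag):
    """Build a typesense-server argument list with explicit lag thresholds."""
    return [
        "typesense-server",
        f"--data-dir={data_dir}",
        f"--api-key={api_key}",
        f"--healthy-read-lag={read_lag}",
        f"--healthy-write-lag={write_lag}",
    ]


# Placeholder values: choose thresholds that match how much lag you can tolerate.
args = typesense_command("/var/lib/typesense", "xyz", read_lag=5000, write_lag=5000)
print(shlex.join(args))
```

The resulting list can be passed to `subprocess.run(args)` or copied into a service definition.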
Anton
04:21 PM
okay, i will try, thank you
Kishore Nallan
04:21 PM
It depends on your business use case. If a bit of lag makes no difference to your business, then it makes no difference.
Kishore Nallan
04:21 PM
It's basically a back pressure mechanism to warn you.
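Since the error acts as back pressure, a client can also respond to it by slowing down instead of failing. The following is a minimal Python sketch of retrying with exponential backoff. It is not part of any Typesense client library: `send_batch` is a hypothetical callable that sends one batch and returns the decoded JSON response as a dict, and the attempt count and delays are arbitrary assumptions.

```python
import time

LAGGING_MESSAGE = "Not Ready or Lagging"


def upsert_with_backoff(send_batch, batch, max_attempts=5, base_delay=1.0,
                        sleep=time.sleep):
    """Call send_batch(batch) until the server stops reporting lag.

    send_batch is a hypothetical callable returning the decoded JSON
    response as a dict. The wait doubles after every lagging response.
    """
    for attempt in range(max_attempts):
        response = send_batch(batch)
        if response.get("message") != LAGGING_MESSAGE:
            return response
        if attempt < max_attempts - 1:
            # Give the server time to catch up before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"server still lagging after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the function easy to exercise without real waiting.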
Kishore Nallan
04:22 PM
The upsert performance is going to be improved at least 4-5x in the near future. I already have a build where it is much faster. It will be merged soon, and I can also share it with you, if you are interested, once I finish benchmarking it further.
Kishore Nallan
04:23 PM
If you run into this issue even after increasing the threshold, let me know, and we can try that build.
Jun 22, 2021 (31 months ago)
Anton
08:07 AM
thank you!
