Alan Buxton
09/11/2025, 2:33 PM
E20250911 07:31:34.140895 1174056 raft_server.cpp:783] 622 queued writes > healthy write lag of 500
I20250911 07:31:37.149612 1174056 raft_server.cpp:692] Term: 15, pending_queue: 0, last_index: 143407, committed: 143407, known_applied: 143407, applying: 0, pending_writes: 0, queued_writes: 622, local_sequence: 48889904
I20250911 07:31:37.149675 1174153 raft_server.h:60] Peer refresh succeeded!
E20250911 07:31:43.174098 1174056 raft_server.cpp:783] 622 queued writes > healthy write lag of 500
I20250911 07:31:47.192770 1174056 raft_server.cpp:692] Term: 15, pending_queue: 0, last_index: 143407, committed: 143407, known_applied: 143407, applying: 0, pending_writes: 0, queued_writes: 622, local_sequence: 48889904
I20250911 07:31:47.192821 1174143 raft_server.h:60] Peer refresh succeeded!

The 622 is not going down. And if I now try to post any more updates to typesense (even with a much smaller batch size than before), I get typesense.exceptions.ServiceUnavailable: [Errno 503] Not Ready or Lagging.
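For context, the updates go in through the Python client with batched imports, roughly like this (a minimal sketch rather than my exact code; the node config, collection name and 'upsert' action are placeholders):

import typesense

# Minimal sketch of the write path; host, port, API key and collection name
# are placeholders, not the exact setup.
client = typesense.Client({
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'my_secret_key',
    'connection_timeout_seconds': 10,
})

batch = [
    {'uri': 'https://example.org/1', 'name': ['Example Org'], 'internal_id': 1},
]

# This is the call that now raises typesense.exceptions.ServiceUnavailable
# ("Not Ready or Lagging"), even with a much smaller batch size than before.
client.collections['organizations'].documents.import_(batch, {'action': 'upsert'})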
Any guidance on what to do in this situation?

Jason Bosco
09/11/2025, 9:51 PM

Jason Bosco
09/11/2025, 9:51 PM

Alan Buxton
09/12/2025, 4:48 AM
I20250911 18:38:12.625999 1987628 raft_server.h:60] Peer refresh succeeded!
I20250911 18:38:14.904073 1987634 log.cpp:536] close a full segment. Current first_index: 2358 last_index: 2369 raft_sync_segments: 0 will_sync: 1 path: /path/to/typesense/data/state/log/log_00000000000000002358_00000000000000002369
I20250911 18:38:14.907173 1987634 log.cpp:550] Renamed `/path/to/typesense/data/state/log/log_inprogress_00000000000000002358' to `/path/to/typesense/data/state/log/log_00000000000000002358_00000000000000002369'
I20250911 18:38:14.907274 1987634 log.cpp:114] Created new segment `/path/to/typesense/data/state/log/log_inprogress_00000000000000002370' with fd=245
I20250911 18:38:22.676370 1987541 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 2377, committed: 2377, known_applied: 2377, applying: 0, pending_writes: 0, queued_writes: 3, local_sequence: 301249
I20250911 18:38:22.676440 1987638 raft_server.h:60] Peer refresh succeeded!
E20250911 18:38:24.646929 1987228 default_variables.cpp:109] Fail to read stat
E20250911 18:38:24.647040 1987228 default_variables.cpp:232] Fail to read memory state
E20250911 18:38:24.647075 1987228 default_variables.cpp:294] Fail to read loadavg
E20250911 18:38:25.651759 1987228 default_variables.cpp:109] Fail to read stat
E20250911 18:38:25.651875 1987228 default_variables.cpp:232] Fail to read memory state
E20250911 18:38:25.651912 1987228 default_variables.cpp:294] Fail to read loadavg
E20250911 18:38:26.655475 1987228 default_variables.cpp:109] Fail to read stat

Alan Buxton
09/12/2025, 4:48 AM

Alan Buxton
09/12/2025, 4:13 PM

Alan Martini
09/12/2025, 4:58 PM
E20250911 07:31:34.140895 1174056 raft_server.cpp:783] 622 queued writes > healthy write lag of 500
I20250911 07:31:37.149612 1174056 raft_server.cpp:692] Term: 15, pending_queue: 0, last_index: 143407, committed: 143407, known_applied: 143407, applying: 0, pending_writes: 0, queued_writes: 622, local_sequence: 48889904
I20250911 07:31:37.149675 1174153 raft_server.h:60] Peer refresh succeeded!
E20250911 07:31:43.174098 1174056 raft_server.cpp:783] 622 queued writes > healthy write lag of 500

These logs usually show up when your server is overloaded trying to process writes. If you have metrics, they will likely show the CPU being heavily utilized.

Now these ones:
E20250911 18:38:24.646929 1987228 default_variables.cpp:109] Fail to read stat
E20250911 18:38:24.647040 1987228 default_variables.cpp:232] Fail to read memory state
E20250911 18:38:24.647075 1987228 default_variables.cpp:294] Fail to read loadavg
E20250911 18:38:25.651759 1987228 default_variables.cpp:109] Fail to read stat
E20250911 18:38:25.651875 1987228 default_variables.cpp:232] Fail to read memory state
E20250911 18:38:25.651912 1987228 default_variables.cpp:294] Fail to read loadavg
E20250911 18:38:26.655475 1987228 default_variables.cpp:109] Fail to read stat

I've never seen these before. Could you try restarting your typesense instance?

Alan Buxton
09/15/2025, 11:55 AM
E20250915 12:36:39.520720 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.520793 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.520865 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.520937 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.521010 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.521085 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.521157 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.521229 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.521299 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.521371 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.521445 2641817 collection.cpp:768] Write to disk failed. Will restore old document
E20250915 12:36:39.521517 2641817 collection.cpp:768] Write to disk failed. Will restore old document
And then a large number of "queued writes" that never goes down:
E20250915 12:40:04.091602 2641798 raft_server.cpp:771] 1447 queued writes > healthy read lag of 1000
E20250915 12:40:04.091656 2641798 raft_server.cpp:783] 1447 queued writes > healthy write lag of 500
I20250915 12:40:13.133855 2641798 raft_server.cpp:692] Term: 12, pending_queue: 0, last_index: 4273, committed: 4273, known_applied: 4273, applying: 0, pending_writes: 0, queued_writes: 1447, local_sequence: 1294612
E20250915 12:40:13.133998 2641798 raft_server.cpp:771] 1447 queued writes > healthy read lag of 1000
E20250915 12:40:13.134014 2641798 raft_server.cpp:783] 1447 queued writes > healthy write lag of 500
I20250915 12:40:13.134050 2641891 raft_server.h:60] Peer refresh succeeded!
E20250915 12:40:22.177606 2641798 raft_server.cpp:771] 1447 queued writes > healthy read lag of 1000
E20250915 12:40:22.177685 2641798 raft_server.cpp:783] 1447 queued writes > healthy write lag of 500
I20250915 12:40:23.182776 2641798 raft_server.cpp:692] Term: 12, pending_queue: 0, last_index: 4273, committed: 4273, known_applied: 4273, applying: 0, pending_writes: 0, queued_writes: 1447, local_sequence: 1294612
I20250915 12:40:23.183512 2641891 raft_server.h:60] Peer refresh succeeded!
E20250915 12:40:31.222455 2641798 raft_server.cpp:771] 1447 queued writes > healthy read lag of 1000
E20250915 12:40:31.222540 2641798 raft_server.cpp:783] 1447 queued writes > healthy write lag of 500
I20250915 12:40:33.234589 2641798 raft_server.cpp:692] Term: 12, pending_queue: 0, last_index: 4273, committed: 4273, known_applied: 4273, applying: 0, pending_writes: 0, queued_writes: 1447, local_sequence: 1294612
I20250915 12:40:33.234967 2641891 raft_server.h:60] Peer refresh succeeded!
The "queued writes" number is not decreasing.
Each time I restart the server I get a different number of queued_writes. Just now, for example, the logs are showing me:
E20250915 12:48:32.260331 2656619 raft_server.cpp:771] 1603 queued writes > healthy read lag of 1000
E20250915 12:48:32.260416 2656619 raft_server.cpp:783] 1603 queued writes > healthy write lag of 500
I20250915 12:48:33.265466 2656619 raft_server.cpp:692] Term: 13, pending_queue: 0, last_index: 4274, committed: 4274, known_applied: 4274, applying: 0, pending_writes: 0, queued_writes: 1603, local_sequence: 1286521
I20250915 12:48:33.265599 2656710 raft_server.h:60] Peer refresh succeeded!
E20250915 12:48:41.304512 2656619 raft_server.cpp:771] 1603 queued writes > healthy read lag of 1000
E20250915 12:48:41.304574 2656619 raft_server.cpp:783] 1603 queued writes > healthy write lag of 500
I20250915 12:48:43.314658 2656619 raft_server.cpp:692] Term: 13, pending_queue: 0, last_index: 4274, committed: 4274, known_applied: 4274, applying: 0, pending_writes: 0, queued_writes: 1603, local_sequence: 1286521
Restarted it again and this time it's 1569:
E20250915 12:53:17.801573 2668546 raft_server.cpp:771] 1569 queued writes > healthy read lag of 1000
E20250915 12:53:17.801613 2668546 raft_server.cpp:783] 1569 queued writes > healthy write lag of 500
I20250915 12:53:26.843848 2668546 raft_server.cpp:692] Term: 14, pending_queue: 0, last_index: 4275, committed: 4275, known_applied: 4275, applying: 0, pending_writes: 0, queued_writes: 1569, local_sequence: 1288065
E20250915 12:53:26.844029 2668546 raft_server.cpp:771] 1569 queued writes > healthy read lag of 1000
E20250915 12:53:26.844046 2668546 raft_server.cpp:783] 1569 queued writes > healthy write lag of 500
I20250915 12:53:26.844117 2668633 raft_server.h:60] Peer refresh succeeded!

In any event, at this point the server can't be used. Any attempt to do anything gets a "Not Ready or Lagging" message:
curl 'http://localhost:8108/collections' -H 'Content-Type: application/json' -H 'X-TYPESENSE-API-KEY: my_secret_key'
{"message": "Not Ready or Lagging"}
At this point I can't do anything other than delete all the server files and start again.
Which is obviously undesirable. So my question really is: when I get the server into this sort of situation by overloading it, what can I do to make it usable again?
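One thing I'm tempted to try, based on the thresholds named in those log lines, is restarting the server with higher lag thresholds so it keeps answering requests while it catches up. Something along these lines (the flag names and values are my guess from the defaults shown in the logs, not confirmed advice):

typesense-server --data-dir /path/to/typesense/data \
  --api-key my_secret_key \
  --healthy-read-lag 10000 \
  --healthy-write-lag 5000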

Alan Martini
09/15/2025, 5:43 PM

Alan Buxton
09/20/2025, 10:50 PM
typesense_memory_active_bytes is about 600,000,000, so well under even 1 GB.
There is plenty of spare memory capacity.
I can't see in the docs how to set the memory limit that typesense should use. What can I do to allow typesense to use more memory?

Alan Martini
09/22/2025, 4:17 PM

Alan Buxton
09/22/2025, 7:35 PM
brew
it's typesense-server@29.0
Yes, I am using embeddings generated from SentenceTransformers.
Schema looks something like this:
'fields': [
{'name': 'uri', 'type': 'string'},
{'name': 'name', 'type': 'string[]'},
{'name': 'internal_id', 'type': 'int64'},
{'name': 'embedding', 'type': 'float[]', 'num_dim': 768, 'optional': True},
],
'default_sorting_field': 'internal_id'
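For completeness, the collection gets created with the Python client along these lines (a rough sketch; the collection name is taken from the /collections/organizations/documents/import path in the logs below, and the client config is a placeholder):

import typesense

client = typesense.Client({
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],  # placeholder config
    'api_key': 'my_secret_key',
    'connection_timeout_seconds': 10,
})

client.collections.create({
    'name': 'organizations',  # assumed from the import path in the logs
    'fields': [
        {'name': 'uri', 'type': 'string'},
        {'name': 'name', 'type': 'string[]'},
        {'name': 'internal_id', 'type': 'int64'},
        # embeddings are computed externally (SentenceTransformers) and passed in
        {'name': 'embedding', 'type': 'float[]', 'num_dim': 768, 'optional': True},
    ],
    'default_sorting_field': 'internal_id',
})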
I have about 3.9 million records I'm trying to load, and it dies every time at a similar place, around 350k documents.
Could it be something in a document that I'm trying to index that is breaking things? How best to troubleshoot?
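One thing I plan to try (sketch below; as far as I know import_ returns one result per document, though I haven't double-checked the exact shape) is logging any per-document failures so a bad record shows itself:

import logging

logger = logging.getLogger(__name__)

def import_batch_with_checks(client, batch):
    # Import a batch and surface any per-document failures; 'organizations' and
    # the 'upsert' action mirror what the import path in the logs suggests.
    results = client.collections['organizations'].documents.import_(
        batch, {'action': 'upsert'})
    for doc, result in zip(batch, results):
        if not result.get('success', False):
            logger.error('import failed for %s: %s',
                         doc.get('uri'), result.get('error'))
    return results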
My client app appears to hang; in this case it was processing batch 8709 (batch size is 40):
2025-09-22 19:29:09,534 - topics.management.commands.refresh_typesense - INFO - Processed 348240/3858329 documents
2025-09-22 19:29:09,536 - topics.services.typesense_service - INFO - {'system_disk_total_bytes': '994662584320', 'system_disk_used_bytes': '827832455168', 'system_memory_total_bytes': '25769803776', 'system_memory_used_bytes': '11712086016', 'typesense_memory_active_bytes': '681328640', 'typesense_memory_allocated_bytes': '606053400', 'typesense_memory_fragmentation_ratio': '0.11', 'typesense_memory_mapped_bytes': '866893824', 'typesense_memory_metadata_bytes': '19777728', 'typesense_memory_resident_bytes': '681328640', 'typesense_memory_retained_bytes': '0'}
2025-09-22 19:29:09,537 - topics.services.typesense_service - INFO - {'cache_hit_ratio': 0.0, 'delete_latency_ms': 0, 'delete_requests_per_second': 0, 'import_70Percentile_latency_ms': 12.0, 'import_95Percentile_latency_ms': 18.0, 'import_99Percentile_latency_ms': 24.0, 'import_latency_ms': 10.821052631578947, 'import_max_latency_ms': 24, 'import_min_latency_ms': 4, 'import_requests_per_second': 9.5, 'latency_ms': {'GET /metrics.json': 1.0, 'GET /stats.json': 0.0, 'POST /collections/organizations/documents/import': 10.821052631578947}, 'overloaded_requests_per_second': 0, 'pending_write_batches': 0, 'requests_per_second': {'GET /metrics.json': 9.5, 'GET /stats.json': 9.5, 'POST /collections/organizations/documents/import': 9.5}, 'search_latency_ms': 0, 'search_requests_per_second': 0, 'total_requests_per_second': 28.5, 'write_latency_ms': 0, 'write_requests_per_second': 0}
2025-09-22 19:29:09,579 - topics.management.commands.refresh_typesense - INFO - Processing batch 8707...
2025-09-22 19:29:09,702 - topics.management.commands.refresh_typesense - INFO - Processed 348280/3858329 documents
2025-09-22 19:29:09,704 - topics.services.typesense_service - INFO - {'system_disk_total_bytes': '994662584320', 'system_disk_used_bytes': '827976687616', 'system_memory_total_bytes': '25769803776', 'system_memory_used_bytes': '11712364544', 'typesense_memory_active_bytes': '678281216', 'typesense_memory_allocated_bytes': '603128672', 'typesense_memory_fragmentation_ratio': '0.11', 'typesense_memory_mapped_bytes': '866893824', 'typesense_memory_metadata_bytes': '19777728', 'typesense_memory_resident_bytes': '678281216', 'typesense_memory_retained_bytes': '0'}
2025-09-22 19:29:09,705 - topics.services.typesense_service - INFO - {'cache_hit_ratio': 0.0, 'delete_latency_ms': 0, 'delete_requests_per_second': 0, 'import_70Percentile_latency_ms': 12.0, 'import_95Percentile_latency_ms': 18.0, 'import_99Percentile_latency_ms': 24.0, 'import_latency_ms': 10.821052631578947, 'import_max_latency_ms': 24, 'import_min_latency_ms': 4, 'import_requests_per_second': 9.5, 'latency_ms': {'GET /metrics.json': 1.0, 'GET /stats.json': 0.0, 'POST /collections/organizations/documents/import': 10.821052631578947}, 'overloaded_requests_per_second': 0, 'pending_write_batches': 0, 'requests_per_second': {'GET /metrics.json': 9.5, 'GET /stats.json': 9.5, 'POST /collections/organizations/documents/import': 9.5}, 'search_latency_ms': 0, 'search_requests_per_second': 0, 'total_requests_per_second': 28.5, 'write_latency_ms': 0, 'write_requests_per_second': 0}
2025-09-22 19:29:09,750 - topics.management.commands.refresh_typesense - INFO - Processing batch 8708...
2025-09-22 19:29:09,835 - topics.management.commands.refresh_typesense - INFO - Processed 348320/3858329 documents
2025-09-22 19:29:09,837 - topics.services.typesense_service - INFO - {'system_disk_total_bytes': '994662584320', 'system_disk_used_bytes': '828202520576', 'system_memory_total_bytes': '25769803776', 'system_memory_used_bytes': '11712528384', 'typesense_memory_active_bytes': '679460864', 'typesense_memory_allocated_bytes': '604075640', 'typesense_memory_fragmentation_ratio': '0.11', 'typesense_memory_mapped_bytes': '866893824', 'typesense_memory_metadata_bytes': '19777728', 'typesense_memory_resident_bytes': '679460864', 'typesense_memory_retained_bytes': '0'}
2025-09-22 19:29:09,838 - topics.services.typesense_service - INFO - {'cache_hit_ratio': 0.0, 'delete_latency_ms': 0, 'delete_requests_per_second': 0, 'import_70Percentile_latency_ms': 12.0, 'import_95Percentile_latency_ms': 18.0, 'import_99Percentile_latency_ms': 24.0, 'import_latency_ms': 10.821052631578947, 'import_max_latency_ms': 24, 'import_min_latency_ms': 4, 'import_requests_per_second': 9.5, 'latency_ms': {'GET /metrics.json': 1.0, 'GET /stats.json': 0.0, 'POST /collections/organizations/documents/import': 10.821052631578947}, 'overloaded_requests_per_second': 0, 'pending_write_batches': 0, 'requests_per_second': {'GET /metrics.json': 9.5, 'GET /stats.json': 9.5, 'POST /collections/organizations/documents/import': 9.5}, 'search_latency_ms': 0, 'search_requests_per_second': 0, 'total_requests_per_second': 28.5, 'write_latency_ms': 0, 'write_requests_per_second': 0}
2025-09-22 19:29:09,897 - topics.management.commands.refresh_typesense - INFO - Processing batch 8709...

And in the typesense server logs everything seems normal at first, and then the dreaded "fail to read" messages appear and I have to terminate the server.
I20250922 20:31:04.126479 96134 raft_server.h:60] Peer refresh succeeded!
I20250922 20:31:14.172351 96043 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 8730, committed: 8730, known_applied: 8730, applying: 0, pending_writes: 0, queued_writes: 3, local_sequence: 1338378
I20250922 20:31:14.172611 96134 raft_server.h:60] Peer refresh succeeded!
I20250922 20:31:21.564889 96044 batched_indexer.cpp:432] Running GC for aborted requests, req map size: 0, reference_q.size: 0
I20250922 20:31:24.210670 96043 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 8730, committed: 8730, known_applied: 8730, applying: 0, pending_writes: 0, queued_writes: 3, local_sequence: 1338378
I20250922 20:31:24.210937 96134 raft_server.h:60] Peer refresh succeeded!
I20250922 20:31:34.259162 96043 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 8730, committed: 8730, known_applied: 8730, applying: 0, pending_writes: 0, queued_writes: 3, local_sequence: 1338378
I20250922 20:31:34.259389 96134 raft_server.h:60] Peer refresh succeeded!
I20250922 20:31:44.301548 96043 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 8730, committed: 8730, known_applied: 8730, applying: 0, pending_writes: 0, queued_writes: 3, local_sequence: 1338378
I20250922 20:31:44.301795 96134 raft_server.h:60] Peer refresh succeeded!
I20250922 20:31:54.343938 96043 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 8730, committed: 8730, known_applied: 8730, applying: 0, pending_writes: 0, queued_writes: 3, local_sequence: 1338378
I20250922 20:31:54.344185 96134 raft_server.h:60] Peer refresh succeeded!
I20250922 20:32:04.387189 96043 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 8730, committed: 8730, known_applied: 8730, applying: 0, pending_writes: 0, queued_writes: 3, local_sequence: 1338378
I20250922 20:32:04.387455 96134 raft_server.h:60] Peer refresh succeeded!
E20250922 20:32:09.988420 95738 default_variables.cpp:109] Fail to read stat
E20250922 20:32:09.988538 95738 default_variables.cpp:232] Fail to read memory state
E20250922 20:32:09.988580 95738 default_variables.cpp:294] Fail to read loadavg
E20250922 20:32:10.994004 95738 default_variables.cpp:109] Fail to read stat
E20250922 20:32:10.994156 95738 default_variables.cpp:232] Fail to read memory state
E20250922 20:32:10.994225 95738 default_variables.cpp:294] Fail to read loadavg
E20250922 20:32:11.999470 95738 default_variables.cpp:109] Fail to read stat
E20250922 20:32:11.999624 95738 default_variables.cpp:232] Fail to read memory state
E20250922 20:32:11.999691 95738 default_variables.cpp:294] Fail to read loadavg
E20250922 20:32:13.004289 95738 default_variables.cpp:109] Fail to read stat
Thanks so much for your continued investigations into this.

Alan Martini
09/23/2025, 4:48 PM

Alan Buxton
09/24/2025, 3:30 PM

Alan Buxton
09/28/2025, 2:08 PM
brew on my M4 Mac.
After a time the typesense server shows these errors:
I20250928 14:48:39.631742 126404 raft_server.h:60] Peer refresh succeeded!
I20250928 14:48:49.665649 126306 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 3888, committed: 3888, known_applied: 3888, applying: 0, pending_writes: 0, queued_writes: 5, local_sequence: 326000
I20250928 14:48:49.665894 126397 raft_server.h:60] Peer refresh succeeded!
I20250928 14:48:59.711536 126306 raft_server.cpp:692] Term: 2, pending_queue: 0, last_index: 3888, committed: 3888, known_applied: 3888, applying: 0, pending_writes: 0, queued_writes: 5, local_sequence: 326000
I20250928 14:48:59.711771 126404 raft_server.h:60] Peer refresh succeeded!
E20250928 14:49:03.139144 125995 default_variables.cpp:109] Fail to read stat
E20250928 14:49:03.139271 125995 default_variables.cpp:232] Fail to read memory state
E20250928 14:49:03.139314 125995 default_variables.cpp:294] Fail to read loadavg
E20250928 14:49:04.144349 125995 default_variables.cpp:109] Fail to read stat
E20250928 14:49:04.144511 125995 default_variables.cpp:232] Fail to read memory state
E20250928 14:49:04.144579 125995 default_variables.cpp:294] Fail to read loadavg
E20250928 14:49:05.149319 125995 default_variables.cpp:109] Fail to read stat
E20250928 14:49:05.149487 125995 default_variables.cpp:232] Fail to read memory state
typesense-memory.png shows memory stats. The x axis is batches. I'm using 40 source objects from the database per batch. Each object might generate zero or more typesense docs, but overall it's roughly a 1-to-1 relationship.

Alan Buxton
09/28/2025, 2:10 PM

Alan Martini
09/29/2025, 5:13 PM

Alan Buxton
10/02/2025, 8:28 PM
brew?

Alan Martini
10/02/2025, 9:29 PM

Alan Buxton
10/07/2025, 6:53 AM

Alan Martini
10/07/2025, 1:12 PM

Alan Buxton
10/08/2025, 8:56 AM

Alan Martini
10/08/2025, 12:18 PM