#community-help

Debugging High CPU Usage in Typesense Server

TLDR Zsolti was seeing high CPU usage on a Typesense server queried through the PHP client. Mihai suggested putting CloudFlare in front of it, while Harrison recommended using a database when fetching all documents. Kishore Nallan explained the query cache's limits and planned improvements.

Mar 26, 2022 (21 months ago)
Zsolti
05:26 PM
Hey. I've been running the latest Typesense server with the PHP client for a few days. More than half of the queries are identical, so most of them could be served from cache. Caching is enabled (in multisearch->perform()), but CPU usage is suspiciously high (8 vCPUs, ~500k docs summed across multiple collections, 100 reqs/sec). Latency is OK. Can you recommend a way to figure out the cache hit/miss ratio? Or how else could I debug this? I could certainly put memcache in front of Typesense, but I'd rather avoid it if I can.
Mihai
08:07 PM
Hi Zsolti, debugging and figuring out why the CPU usage is so high is a must. But if it turns out that this is normal behavior, I would recommend putting CloudFlare in front of Typesense, especially if your queries and data are fairly static.
Zsolti
08:24 PM
So basically I should not trust the use_cache feature? :) I put a very simple memcache in front of the Typesense calls, and it reduced the queries per second from 100 to 30, so something was wrong with the Typesense cache.
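The approach described here, a small cache in front of the search calls, can be sketched roughly as follows. This is a minimal illustration and not Zsolti's actual code: it is written in Python rather than PHP, an in-process dict stands in for memcached, and `SearchCache` and `fake_backend` are hypothetical names. Because it counts hits and misses, it also gives a rough client-side answer to the hit/miss ratio question.

```python
import json
import time


class SearchCache:
    """Tiny in-process TTL cache placed in front of a search backend (hypothetical helper)."""

    def __init__(self, search_fn, ttl_seconds=60.0, clock=time.monotonic):
        self._search_fn = search_fn   # function that actually queries the search server
        self._ttl = ttl_seconds
        self._clock = clock           # injectable so expiry can be tested without sleeping
        self._entries = {}            # cache key -> (expires_at, result)
        self.hits = 0
        self.misses = 0

    def search(self, params):
        # Serialize with sorted keys so equal parameter sets map to the same key.
        key = json.dumps(params, sort_keys=True)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: ask the backend and remember the answer.
        self.misses += 1
        result = self._search_fn(params)
        self._entries[key] = (now + self._ttl, result)
        return result

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def fake_backend(params):
    """Stand-in for a real search call; only counts how often it is invoked."""
    fake_backend.calls += 1
    return {"found": 0, "echo": params}


fake_backend.calls = 0

cache = SearchCache(fake_backend, ttl_seconds=60.0)
for _ in range(10):
    cache.search({"q": "*", "filter_by": "category:=shoes"})
cache.search({"q": "*", "filter_by": "category:=hats"})
print(fake_backend.calls, round(cache.hit_ratio(), 2))
```

In a real deployment the dict would be replaced by a shared store such as memcached, and `search_fn` by the actual client call.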
Zsolti
08:30 PM
Also, my other question: I have a large number of non-text queries like q="*" to generate filtered list pages. As far as I can see, these queries still consume a noticeable amount of CPU time compared to the same feature implemented with MySQL queries. Is this assumption correct? Are there any options to optimize these non-textual queries? Should I use MySQL for this type of query?
Harrison
09:01 PM
if you're just fetching all documents, your database is probably going to beat typesense

Harrison
09:01 PM
just because the database's one job is handling your persistent storage and retrieving data, rather than searching, so it's got a pretty good chance of being able to fetch and serve docs faster if you're just unconditionally fetching

Zsolti
09:21 PM
Thank you.
Mar 27, 2022 (21 months ago)
Kishore Nallan
02:18 AM
Zsolti, Typesense's query cache implementation is currently pretty conservative: we cache only 128 query responses and start evicting the oldest inserted element when the cache grows past 128 entries. The default cache TTL is also only 60s. This might explain why the caching does not seem to have much impact. We plan to make the cache size configurable.

As for the q=* queries, Typesense is slower because we haven't yet optimized it for this use case. It does a couple of dumb things at the moment, which we again want to fix in the future. But again, this should be cacheable if the cache is large enough.
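To illustrate why a cache this small may have little effect, here is a minimal simulation of the policy as described above: 128 entries, oldest-inserted eviction, 60s TTL. It is a sketch under those stated assumptions and not Typesense's actual implementation. `FifoTtlCache` and `hit_ratio` are hypothetical names, and the round-robin workload at 100 requests per second is invented for illustration.

```python
from collections import OrderedDict


class FifoTtlCache:
    """Model of a cache that evicts the oldest inserted entry (assumed policy)."""

    def __init__(self, capacity=128, ttl_seconds=60.0):
        self.capacity = capacity
        self.ttl_seconds = ttl_seconds
        self._inserted_at = OrderedDict()  # key -> insertion time, oldest first

    def lookup(self, key, now):
        inserted = self._inserted_at.get(key)
        if inserted is None:
            return False
        if now - inserted >= self.ttl_seconds:
            # Entry outlived its TTL: drop it and report a miss.
            del self._inserted_at[key]
            return False
        # A hit does not reorder entries, so eviction stays insertion-ordered.
        return True

    def insert(self, key, now):
        self._inserted_at.pop(key, None)
        self._inserted_at[key] = now
        if len(self._inserted_at) > self.capacity:
            self._inserted_at.popitem(last=False)  # evict the oldest inserted entry


def hit_ratio(distinct_queries, total_requests=1000, requests_per_second=100):
    """Replay a round-robin workload over `distinct_queries` keys and return the hit ratio."""
    cache = FifoTtlCache()
    hits = 0
    for i in range(total_requests):
        now = i / requests_per_second
        key = f"query-{i % distinct_queries}"
        if cache.lookup(key, now):
            hits += 1
        else:
            cache.insert(key, now)
    return hits / total_requests


print(hit_ratio(100))  # working set fits within 128 entries
print(hit_ratio(200))  # working set exceeds 128 entries
```

In this model, a working set that fits in the cache is served almost entirely from it, while a round-robin working set larger than 128 entries evicts every entry before it is requested again. Real traffic is rarely purely round-robin, so actual hit ratios would fall somewhere in between.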

Mar 28, 2022 (21 months ago)
Mihai
06:32 PM
Hi Zsolti, I didn't say you shouldn't rely on the Typesense cache, but it's better to use the right tool for the job; as you have seen, the memcache implementation did the job. And Kishore Nallan's explanation supports my point.

Mar 29, 2022 (21 months ago)
Zsolti
01:10 PM
Thanks for the explanation!
Zsolti
01:12 PM
I'm looking forward to seeing the q=* optimizations in the future :)
