Hi general question, what causes CPU usage to be o...
# community-help
a
Hi, general question: what causes CPU usage to be over 75%? We are seeing no slowdowns in our requests, but the cloud dashboard is reporting spikes in usage. We are on 4GB 2vCPUs, 4hr burst with HA enabled (3 nodes). The only large background job we have is an hourly update of our full collection/catalogue (an interim solution; we're looking into deltas in future), but this doesn't seem to be the direct cause. At 1am this morning we got auto-upgraded to the 8GB 4vCPU tier, but according to the docs we have no reason to be on it, so we have dropped it back down and disabled autoscale. On average we have around 500 users per hour using the search engine for our product catalogue (retail store). We have about 30 filter facets in use and run collection search queries with some caching via the SDK (300 seconds / 5 mins). We have max_facet_values set to 500 (mostly to accommodate our list of brands). We also have max_typos set to 2, if this helps. Is anything we are doing likely to cause the high CPU load on our cluster?
f
Faceting is a computationally heavy operation. Even with max_facet_values at 500, you are still putting a heavy load on the CPU. Can you share some of the queries that are causing this spike?
a
Is there a way of knowing which query is causing the high usage? Here's a log of the search params for one of our collections:
{
    q: '*',
    query_by: 'name,brand,size,description',
    filter_by: 'gender:=["Mens"] && nav:=["footwear"]',
    sort_by: 'score:desc',
    page: 1,
    per_page: 80,
    facet_by: 'currentPrice,category,gender,brand,colour,size,waist,insideLeg,rise,hem,chest,pitToPit,ukSize,euSize,usSize,sleeveLength,neckline,pattern,knitStyle,collar,pockets,fit,fabricType,fabricWash,material,type,style,length,armpitToArmpit,armpitToCuff,collarToHem,grade',
    num_typos: 2,
    exhaustive_search: undefined,
    max_facet_values: 500
}
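One way to test whether faceting is behind the load is to request only the facets that apply to the section being browsed, rather than all 32 fields on every query. The sketch below is a minimal illustration of that idea: `COMMON_FACETS`, `FACETS_BY_NAV`, and `facetByFor` are hypothetical names, and the grouping of fields per nav section is an assumption made for the example, not taken from the real catalogue.

```javascript
// Facets assumed to be useful in every section (illustrative grouping only).
const COMMON_FACETS = ['currentPrice', 'category', 'gender', 'brand', 'colour', 'size', 'grade'];

// Hypothetical mapping from nav section to the extra facets relevant there.
const FACETS_BY_NAV = {
  footwear: ['ukSize', 'euSize', 'usSize', 'material', 'style'],
  jeans: ['waist', 'insideLeg', 'rise', 'hem', 'fit', 'fabricWash'],
};

// Build a facet_by string for one nav section, falling back to the common set
// when the section has no specific facets configured.
function facetByFor(nav) {
  const extra = FACETS_BY_NAV[nav] || [];
  return [...COMMON_FACETS, ...extra].join(',');
}

// Same query as in the log above, but with a reduced facet list.
const searchParams = {
  q: '*',
  query_by: 'name,brand,size,description',
  filter_by: 'gender:=["Mens"] && nav:=["footwear"]',
  sort_by: 'score:desc',
  page: 1,
  per_page: 80,
  facet_by: facetByFor('footwear'),
  num_typos: 2,
  max_facet_values: 500,
};

// Number of facet fields requested for this section.
console.log(searchParams.facet_by.split(',').length);
```

If CPU usage drops noticeably with the shorter list, that would point at faceting as the main cost; if it does not, the cause is more likely elsewhere.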
f
We can only see the CPU usage, not which request specifically is causing it, unless it's something that takes long to execute (does it take long to execute?)
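Since the dashboard only shows aggregate CPU, one option is to record timings on the client side. The sketch below assumes each search response carries a `search_time_ms` field and flags any search above a threshold, along with how many facet fields it requested. `slowSearchEntry` and the 50 ms threshold are hypothetical choices for illustration only.

```javascript
// Threshold above which a search is worth inspecting (arbitrary assumption).
const SLOW_MS = 50;

// Given the params that were sent and the response that came back, return a
// log entry for slow searches, or null for fast ones.
// Assumes the response exposes the server-side time as search_time_ms.
function slowSearchEntry(params, response, thresholdMs = SLOW_MS) {
  const took = response.search_time_ms;
  if (typeof took !== 'number' || took < thresholdMs) return null;
  return {
    took,
    q: params.q,
    filter_by: params.filter_by,
    facetCount: params.facet_by ? params.facet_by.split(',').length : 0,
  };
}

// Example with a made-up response object standing in for a real search result.
const entry = slowSearchEntry(
  { q: '*', filter_by: 'nav:=["footwear"]', facet_by: 'brand,colour,size' },
  { search_time_ms: 120 }
);
if (entry) console.log('slow search', entry);
```

Note that fast individual responses can still add up to high CPU if each one does a lot of work, so low timings here would not rule out faceting as the cause.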
a
No, response times are super fast, which is why we're confused that the dashboard is reporting high usage when there are no slowdowns at all.
f
Is the 75% usage consistent? Or did it only spike for a brief moment?
a
Our P95 graph shows this; ignore everything after 10am, which is when we triggered the cluster downgrade. If I switch to the average over the last 6 hours, it seems pretty low.
f
Can you give me your cluster's id?
a
Is that just the cluster name (the one used for the hostname)?
aeqyxbt7ns6djrp0p
f
It looks like your cluster has been hitting 100% or close to it consistently for the past 12h, with under 1 request / second
There are some spikes in requests / second, but CPU usage is still high regardless of them.
a
That doesn't seem right, especially during the night, when our update job isn't running apart from at 3am UTC. Unless we are being hit with bot traffic? We deployed this to production yesterday, so perhaps we are being hit by indexers.
k
exhaustive_search=true could cause heavy query times. Can you try removing that?
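A minimal sketch of that change, assuming the search params are built as a plain object before being passed to the SDK: the hypothetical helper `withoutExhaustiveSearch` drops the `exhaustive_search` key and any key whose value is `undefined`, so neither is sent to the server. The params logged earlier show `exhaustive_search: undefined`, so it may already be unset in that particular query; this only guarantees it.

```javascript
// Return a copy of the search params without exhaustive_search and without
// keys whose value is undefined. The input object is not modified.
function withoutExhaustiveSearch(params) {
  return Object.fromEntries(
    Object.entries(params).filter(
      ([key, value]) => key !== 'exhaustive_search' && value !== undefined
    )
  );
}

// Example using a trimmed version of the params from the log above.
const cleaned = withoutExhaustiveSearch({
  q: '*',
  query_by: 'name,brand,size,description',
  per_page: 80,
  num_typos: 2,
  exhaustive_search: true,
  max_facet_values: 500,
});

console.log(Object.keys(cleaned));
```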