Hi I am running typesense on an EKS cluster with 3 nodes of typesense #community-help

Manav Kothari

11/01/2024, 2:24 PM

Hi, I am running Typesense on an EKS cluster with 3 r7g.xlarge nodes (4 vCPU, 32 GB RAM each). I have a total of 60 million records, with nearly 12 of the 14 fields indexed (occupying 28 GB out of 32 GB). When I perform a single-filter query on one field, it takes more than 5 seconds. How can I optimize this? Just so you know, there is no production traffic yet.

Jason Bosco

11/01/2024, 2:39 PM

Could you share a curl request to Typesense, showing all the search parameters?

Manav Kothari

11/01/2024, 2:41 PM

curl '{{URL}}/multi_search?x-typesense-api-key={{KEY}}' \
  -H 'accept: application/json, text/plain, */*' \
  -H 'accept-language: en-GB,en-US;q=0.9,en;q=0.8' \
  -H 'content-type: text/plain' \
  -H 'sec-ch-ua: "Chromium";v="124", "Google Chrome";v="124", "Not-A.Brand";v="99"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: same-site' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36' \
  --data-raw '{"searches":[{"exhaustive_search":true,"query_by":"country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state","highlight_full_fields":"country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state","collection":"people","q":"*","facet_by":"city,country,country_region,seniority,state,title","filter_by":"country:=[`India`]","max_facet_values":10,"page":1,"per_page":12},{"exhaustive_search":true,"query_by":"country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state","highlight_full_fields":"country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state","collection":"people","q":"*","facet_by":"country","max_facet_values":10,"page":1}]}'

Manav Kothari

11/01/2024, 2:43 PM

[
  {
    "exhaustive_search": true,
    "query_by": "country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state",
    "highlight_full_fields": "country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state",
    "collection": "people",
    "q": "*",
    "facet_by": "city,country,country_region,seniority,state,title",
    "filter_by": "country:=[`India`]",
    "max_facet_values": 10,
    "page": 1,
    "per_page": 12
  },
  {
    "exhaustive_search": true,
    "query_by": "country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state",
    "highlight_full_fields": "country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state",
    "collection": "people",
    "q": "*",
    "facet_by": "country",
    "max_facet_values": 10,
    "page": 1
  }
]

Jason Bosco

11/01/2024, 2:51 PM

It's most likely the exhaustive_search flag. Could you remove that? And also change

country:=[India]

to

country:[India]

(remove the equals sign - if you search the docs for exact vs non-exact match, you'll see what it does)

Jason Bosco

11/01/2024, 2:52 PM

Could you also add

facet_sample_threshold: 10000

and

facet_sample_percent: 20

as additional parameters?
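
The suggestions so far (dropping exhaustive_search, switching the filter from := to :, and enabling facet sampling) can be sketched as a small payload transform. This is only an illustrative Python sketch: the helper name apply_suggested_tweaks is hypothetical, the search dict is trimmed from the request shared earlier in the thread, and the string replacement naively assumes that no filter value itself contains ":=".

```python
import copy
import json


def apply_suggested_tweaks(search):
    """Return a copy of one search entry with the thread's suggestions applied."""
    tuned = copy.deepcopy(search)
    # Remove exhaustive_search so the server uses its default behaviour.
    tuned.pop("exhaustive_search", None)
    # Switch exact-match filters (":=") to non-exact match (":").
    # Naive assumption: no filter value contains the literal text ":=".
    if "filter_by" in tuned:
        tuned["filter_by"] = tuned["filter_by"].replace(":=", ":")
    # Estimate facet counts from a 20% sample once hits exceed 10000.
    tuned["facet_sample_threshold"] = 10000
    tuned["facet_sample_percent"] = 20
    return tuned


# Trimmed copy of the first search from the original request.
original = {
    "exhaustive_search": True,
    "collection": "people",
    "q": "*",
    "facet_by": "city,country,country_region,seniority,state,title",
    "filter_by": "country:=[`India`]",
    "max_facet_values": 10,
    "page": 1,
    "per_page": 12,
}

payload = {"searches": [apply_suggested_tweaks(original)]}
print(json.dumps(payload, indent=2))
```

The printed payload is what would be sent as the body of the multi_search request; the original dict is left unmodified so the two versions can be compared side by side.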

Manav Kothari

11/01/2024, 3:18 PM

okay let me try.

Manav Kothari

11/01/2024, 3:25 PM

{
  "searches": [
    {
      "query_by": "country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state",
      "highlight_full_fields": "country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state",
      "collection": "people",
      "q": "*",
      "facet_by": "city,country,country_region,seniority,state,title",
      "filter_by": "country:[`India`]",
      "max_facet_values": 10,
      "facet_sample_threshold": 10000,
      "facet_sample_percent": 20,
      "page": 1,
      "per_page": 12
    },
    {
      "query_by": "country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state",
      "highlight_full_fields": "country,country_region,email,first_name,name,seniority,job_start_date,title,last_name,linkedin_url,organization_id,city,state",
      "collection": "people",
      "q": "*",
      "facet_by": "country",
      "facet_sample_threshold": 10000,
      "facet_sample_percent": 20,
      "max_facet_values": 10,
      "page": 1
    }
  ]
}

It still takes 4.3 seconds. Let me know if any of the values are wrong here.
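
One way to see where the time goes is to read the search_time_ms value that Typesense reports for each entry in the results array of a multi_search response, and compare it with the wall-clock time of the HTTP call. A minimal Python sketch, assuming the response body has already been parsed from JSON; the helper name per_search_timings is hypothetical, and the sample numbers are placeholders, not measurements from this thread.

```python
import json


def per_search_timings(response):
    """Return (index, found, search_time_ms) for each multi_search result."""
    rows = []
    for index, result in enumerate(response.get("results", [])):
        # -1 marks a result that carries no timing (for example, an error entry).
        rows.append((index, result.get("found", 0), result.get("search_time_ms", -1)))
    return rows


# Placeholder response, trimmed to the keys used here; the values are made up.
sample = json.loads(
    '{"results": ['
    '{"found": 100, "search_time_ms": 40},'
    '{"found": 200, "search_time_ms": 10}'
    ']}'
)

timings = per_search_timings(sample)
for index, found, ms in timings:
    print(f"search {index}: found={found} search_time_ms={ms}")

# Identify which of the searches the server spent the most time on.
slowest = max(timings, key=lambda row: row[2])
print(f"slowest search index: {slowest[0]}")
```

If the server-side timings are much smaller than the observed wall-clock time, the gap is likely in the network or client rather than in the search itself; if one search dominates, that is the one to simplify first.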

Manav Kothari

11/01/2024, 6:50 PM

Do I need more vCPUs for this query pattern?

Jason Bosco

11/01/2024, 9:21 PM

Yeah, more CPUs and CPUs with higher clock speeds will help. But besides that, it's hard to debug further without having access to the dataset. And we only offer this type of performance tuning help on Typesense Cloud.

Manav Kothari

11/02/2024, 5:48 AM

I understand, but with no production traffic it should give better latency, right? And this is not even a complex query. If you could share some performance optimization tips or some kind of documentation, that would be helpful.

Jason Bosco

11/02/2024, 2:41 PM

The tips I shared above are the generic ones. Beyond that, optimizations become specific to the dataset, hardware configuration, etc. This is why we only offer this on Typesense Cloud, where we have complete visibility into runtime characteristics.
