# community-help
d
Hi - I made a schema change to add embeddings. CPU is now hitting 100%. 1. Will this go back to normal after the operation is done? 2. Do I need to do an auto upgrade? If so, will I be able to downgrade afterwards?
k
CPU will revert to its normal baseline once all the documents have been embedded.
d
Ok thanks
The instance is now unhealthy. Should I upgrade?
a
Hi Denny, we received the alert on it, I'm checking it now.
Yeah, RAM increased linearly as it created and indexed the embeddings, then it crashed due to OOM. To calculate the amount necessary for your embeddings, you'll want to use 7 bytes x num_of_documents x num_of_dimensions.
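As a quick sanity check, the rule of thumb above can be sketched in Python. The 7 bytes/dimension factor is the estimate given in this thread, not an exact figure:

```python
def embedding_ram_bytes(num_documents: int, num_dimensions: int) -> int:
    """Rough RAM needed to hold embeddings: ~7 bytes per dimension per document
    (rule of thumb from this thread, not an exact figure)."""
    BYTES_PER_DIMENSION = 7
    return BYTES_PER_DIMENSION * num_documents * num_dimensions

# Example from this thread: 72M documents with a 384-dimension model
gb = embedding_ram_bytes(72_000_000, 384) / 1e9
print(f"{gb:.1f} GB")  # → 193.5 GB
```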
d
What is the estimated cost for the new instance?
Will the embeddings operation continue after you upgrade?
a
You can use the calculator here, but it will also show on the Modify page (Cluster Configuration -> Modify). https://cloud.typesense.org/pricing/calculator Do you know how many documents you have and how many dimensions the model you're using has? It's a good thing to calculate beforehand. If the number is something like 600 GB, it might be unfeasible.
d
72M docs currently, no idea on dimensions, I just copied what was in the docs. Ts mini model or something?
a
It has 512 dimensions IIRC, will check again.
d
Thanks
a
384. 72M x 7 bytes x 384 = 193.5 GB. You'd need the 256 GB cluster (which has a minimum of 8 vCPUs), at a rate of $4.41/hr ($3,175.20/month).
If you're just testing, a better approach would be to create a collection with a few hundred thousand documents and test on that. Then maybe apply some filtering on the 72M collection so you generate embeddings for fewer documents.
d
Hrmm… I wanted to test whether having an embedding model can produce better results, as I'm having issues with the search algorithm right now.
Is it possible to reverse the embedding change?
a
I will have to check that. If you have a script that repopulates it from scratch, I can just delete the data dir. Do you have one?
d
I just added the embedding field, which populates from "brand_name" and "name", directly into the schema settings from the Typesense UI
On a side note, good to know how much this will cost. Are the conversation models as fast as regular search? The problem I'm having is that when I search for "Apple AirPods Pro 3", I'm unable to return products that are named "AirPods Pro 3", even after using drop_tokens_threshold=100
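The UI change Denny describes (an auto-embedding field populated from `brand_name` and `name`) corresponds roughly to a schema alter like the sketch below. The collection name `products` and the `ts/e5-small` model are taken from later in this thread; treat the exact payload as an illustration rather than a copy of what the UI sent:

```python
import json

# Hedged sketch of the alter payload: adds an auto-embedding field whose
# vectors are generated from brand_name and name using a built-in model.
alter_payload = {
    "fields": [
        {
            "name": "embedding",
            "type": "float[]",
            "embed": {
                "from": ["brand_name", "name"],
                "model_config": {"model_name": "ts/e5-small"},
            },
        }
    ]
}

# This body would be sent as: PATCH /collections/products
print(json.dumps(alter_payload, indent=2))
```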
a
The natural language search is much lighter, since it uses the LLM just to transform natural language into a query. RAG will still need embeddings to be generated and stored.
But your problem seems like it should be handled by fine-tuning the search parameters. We have other parameters besides drop_tokens_threshold (we can open another ticket for it) that you can try.
d
Oh ok, that would be great
Are we able to reverse the change so the instance can be healthy again?
Sorry!
a
On your cluster state: it restarted and seems to be finishing reindexing the data, but it resumed from the last command, which was the alter command. What I know will work is letting it finish, then altering the collection schema to remove the embeddings. But it will need the RAM for that; otherwise it will just loop: OOM, restart, try to process, OOM. I'm trying some other things here.
d
Ok no problem. Happy to also upgrade temporarily to get it to finish as well. However from memory someone told me once I upgrade I can’t downgrade
a
Hmm, no, no problem downgrading. You just have to make sure the new cluster has enough RAM for your collection (so delete data before downgrading). If you need to disable the SDN or High Availability options, you'd have to clone the cluster, which would generate a new hostname, but RAM and vCPUs you can freely change as you want.
Ok, good to know. I have tried creating new infrastructure for your cluster (with the same configuration), and hopefully, building from a snapshot, it will not continue the alter command. We'll now need to wait for it to reindex and see. If it doesn't work, then we'll try upgrading.
d
Ok thanks!
a
@Denny Vuong unfortunately, it just loaded the e5-small model, which means it's creating the embeddings, so I will go ahead with the upgrade
d
Ok no problem
@Alan Martini Where can I raise a support ticket for my "apple airpods" issue?
a
@Denny Vuong you can open another thread here, a GitHub issue in Typesense/Typesense, or send an email to support@typesense.org (if data privacy is necessary).
d
btw, is the server now upgraded, and we're just waiting for everything to finish before we can remove the embedding? I can see it's still at close to 100% CPU usage
Let’s not remove embedding after it’s done, I want to test it out first.
a
@Denny Vuong, yeah, but it seems to still be altering:
Copy code
2025-09-17 15:08:25.000 UTC I Altered 13172736 so far.
72M is a really big number to generate embeddings for, huh
d
A few more days I guess. Is this a common use case for what I need? I’m only expecting to have max 200M products long term I think
k
Alters tend to run slowly to prevent disruption to ongoing search. But in this case, with embeddings, it's very CPU intensive, so even then it's maxing out the CPU.
d
Can we upgrade again to speed it up?
k
Alters are throttled by nature, so a larger instance might not help either. So sorry about this, this is not something we thought through. In fact, we didn't allow altering embedding fields for a long time.
What embedding model are you using?
d
All good!
I followed the documentation so I think it was ts/all-MiniLM-L12-v2
a
It was e5-small, with 384 dimensions.
Good thing your cluster is still answering requests though
d
Do you have any other users using e5-small for product search, and are the results accurate? I'm assuming if I had to change embedding models, it'll take another few days to alter?
k
E5-small is a good all-round model. We do have people using it for e-commerce.
I think it might be easier for you to generate the embeddings externally and ingest into typesense. You can still use the embedding field for querying. During indexing we will skip triggering the embedding if a field already has vector values.
Easiest course of action I see is for you to do this on another node and shut this one down.
If you generate outside you have an option to use GPUs so the entire process will be very fast.
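A minimal sketch of the external-embedding ingestion described above: each document in the JSONL import body carries a precomputed vector in the embedding field, so the server skips generating it. Field and collection names follow this thread; the 3-dimension vectors are toy values (a real e5-small vector has 384 dimensions):

```python
import json

def to_import_jsonl(docs, vectors):
    """Build a JSONL body for
    POST /collections/products/documents/import?action=upsert,
    attaching a precomputed vector to each document's embedding field."""
    lines = []
    for doc, vec in zip(docs, vectors):
        lines.append(json.dumps({**doc, "embedding": vec}))
    return "\n".join(lines)

# Toy example: one product with a 3-dim vector for illustration
docs = [{"id": "1", "name": "AirPods Pro 3", "brand_name": "Apple"}]
vectors = [[0.12, -0.53, 0.88]]
print(to_import_jsonl(docs, vectors))
```

Because the vectors are generated outside Typesense (e.g. on a GPU box), the import itself is just ordinary document indexing.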
d
Once this finishes, it should be ok though right? I don’t expect too many new products being added in the short term
k
Yeah once it's done it is fine
a
Hey Denny, just doing an EOD update here:
Copy code
2025-09-17 20:56:22.000 UTC I Altered 19562496 so far.
d
Thanks for the update!
@Kishore Nallan @Alan Martini - The update is now done. I'm just testing it, but the results are terrible. None of the returned products are even AirPods. Is this the correct way to use it?
Copy code
curl 'https://xxxxxxx-1.a1.typesense.net/multi_search?q=apple+airpods+pro+3&conversation=false&conversation_model_id=conv-model-1' \
        -X POST \
        -H "Content-Type: application/json" \
        -H "X-TYPESENSE-API-KEY: xxxxxxxxxxxx" \
        -d '{
              "searches": [
                {
                  "collection": "products",
                  "query_by": "embedding",
                  "exclude_fields": "embedding"
                }
              ]
            }'
k
There is no need to use a conversation model here. Can you try without it?
d
[Screenshots attached: Screenshot 2025-09-20 at 9.31.49 am.png, Screenshot 2025-09-20 at 9.31.57 am.png]
first result is an iPhone
k
Are you expecting the query to match exactly, with the same version number? That's not how embedding-based vector search works. It gives only results that are close in semantic space, so it's used to handle synonyms and words that mean the same thing (e.g. sneakers vs shoes). You are not going to get exact e-commerce matches based on model/attribute keywords, because these models are trained only on semantic similarity.
d
Hrmm, I mean I was hoping for at least matching AirPods. I think doing hybrid search with name and embedding, then reranking, works
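The hybrid approach Denny lands on can be sketched as a multi_search body: listing both a keyword field and the embedding field in query_by makes Typesense combine keyword and vector results. The alpha value (weight given to the vector score) is an illustrative choice, not one recommended in this thread:

```python
import json

# Hedged sketch of a hybrid search request: keyword matching on "name"
# plus semantic matching on "embedding", with rank fusion between the two.
hybrid_search = {
    "searches": [
        {
            "collection": "products",
            "q": "apple airpods pro 3",
            "query_by": "name,embedding",
            "exclude_fields": "embedding",
            # alpha controls the weight of the vector score (illustrative value)
            "vector_query": "embedding:([], alpha: 0.3)",
        }
    ]
}

# This body would be POSTed to /multi_search (no conversation model needed)
print(json.dumps(hybrid_search, indent=2))
```

With a lower alpha, exact keyword matches like "AirPods Pro 3" dominate while the embedding still helps with synonyms.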