# community-help
d
Hi - I made a schema change to add embeddings. CPU is now hitting 100%. 1. Will this go back to normal after the operation is done? 2. Do I need to do an auto upgrade? If so, will I be able to downgrade afterwards?
k
CPU will revert to its normal baseline once all the documents have been embedded.
d
Ok thanks
The instance is now unhealthy. Should I upgrade?
a
Hi Denny, we received the alert on it, I'm checking it now.
Yeah, RAM increased linearly as it created and indexed the embeddings, then it crashed due to OOM. To calculate the amount necessary for your embeddings, you'll want to use 7 bytes x num_of_documents x num_of_dimensions.
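As a quick sanity check, the rule of thumb above can be sketched in Python. The 7 bytes/dimension factor is the estimate given in this thread, not an exact figure:

```python
def embedding_ram_bytes(num_documents: int, num_dimensions: int) -> int:
    """Rough RAM needed to hold embeddings: ~7 bytes per dimension per document
    (rule of thumb from this thread, not an exact figure)."""
    BYTES_PER_DIMENSION = 7
    return BYTES_PER_DIMENSION * num_documents * num_dimensions

# Example from this thread: 72M documents with a 384-dimension model
gb = embedding_ram_bytes(72_000_000, 384) / 1e9
print(f"{gb:.1f} GB")  # → 193.5 GB
```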
d
What is the estimated cost for the new instance?
Will the embeddings operation continue after you upgrade?
a
You can use the calculator here, but it will also show on the Modify page (Cluster Configuration -> Modify). https://cloud.typesense.org/pricing/calculator Do you know how many documents you have and how many dimensions the model you're using has? It's a good thing to calculate beforehand. If the number is something like 600 GB, it might be unfeasible.
d
72M docs currently, no idea on dimensions, I just copied what was in the docs. Ts mini model or something?
a
It has 512 dimensions IIRC, will check again.
d
Thanks
a
384. 72M x 7 bytes x 384 = 193.5 GB. You'd need the 256 GB cluster (which has a minimum of 8 vCPUs), at a rate of $4.41/hr ($3,175.20/month).
If you're just testing, a better approach would be to create a collection with a few hundred thousand documents and test on that. Then maybe apply some filtering on the 72M collection so you generate embeddings for fewer documents.
d
Hrmm… I wanted to test whether having an embedding model can produce better results, as I'm having issues with the search algorithm right now.
Is it possible to reverse the embedding change?
a
I will have to check that. If you have a script that repopulates it from scratch, I can just delete the data dir. Do you have one?
d
I just added the embedding field, which populates from "brand_name" and "name", directly into the schema settings from the Typesense UI
On a side note, good to know how much this will cost. Are the conversation models as fast as regular search? The problem I'm having is that when I search for "Apple AirPods Pro 3", I'm unable to return products that are named "AirPods Pro 3", even after using drop_tokens_threshold=100
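The UI change Denny describes (an auto-embedding field populated from `brand_name` and `name`) corresponds roughly to a schema alter like the sketch below. The collection name `products` and the `ts/e5-small` model are taken from later in this thread; treat the exact payload as an illustration rather than a copy of what the UI sent:

```python
import json

# Hedged sketch of the alter payload: adds an auto-embedding field whose
# vectors are generated from brand_name and name using a built-in model.
alter_payload = {
    "fields": [
        {
            "name": "embedding",
            "type": "float[]",
            "embed": {
                "from": ["brand_name", "name"],
                "model_config": {"model_name": "ts/e5-small"},
            },
        }
    ]
}

# This body would be sent as: PATCH /collections/products
print(json.dumps(alter_payload, indent=2))
```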
a
The natural language search is much lighter, since it uses the LLM just to transform natural language into a query. RAG will still need embeddings to be generated and stored.
But your problem seems like it should be handled by fine-tuning the search parameters. We have other parameters besides drop_tokens_threshold (we can open another ticket for it) that you can try.
d
Oh ok, that would be great
Are we able to reverse the change so the instance can be healthy again?
Sorry!
a
On your cluster state: it restarted and seems to be finishing reindexing the data, but it resumed from the last command, which was the alter command. What I know will work is letting it finish, then altering the collection schema to remove the embeddings. But it will need the RAM for that; otherwise it will just loop: OOM, restart, try to process, OOM. I'm trying some other things here.
d
Ok no problem. Happy to also upgrade temporarily to get it to finish as well. However from memory someone told me once I upgrade I can’t downgrade
a
Hmm, no, no problem downgrading. You just have to make sure the new cluster has enough RAM for your collection (so delete data before downgrading). If you need to disable the SDN or High Availability options, you'd have to clone the cluster, which would generate a new hostname, but RAM and vCPUs you can freely change as you want.
Ok, good to know. I have tried creating new infrastructure for your cluster (with the same configuration), and hopefully, building from a snapshot, it will not continue the alter command. We'll now need to wait for it to reindex and see. If it doesn't work, then we'll try upgrading.
d
Ok thanks!
a
@Denny Vuong unfortunately, it just loaded the e5-small model, which means it's creating the embeddings, so I will go ahead with the upgrade
d
Ok no problem
@Alan Martini Where can I raise a support ticket for my "apple airpods" issue?
a
@Denny Vuong you can open another thread here, a GitHub issue in Typesense/Typesense, or send an email to support@typesense.org (if data privacy is necessary).
d
btw, is the server now upgraded, and we're just waiting for everything to finish before we can remove the embedding? I can see it's still at close to 100% CPU usage
Let’s not remove embedding after it’s done, I want to test it out first.
a
@Denny Vuong, yeah, but it seems to still be altering:
Copy code
2025-09-17 15:08:25.000 UTC I Altered 13172736 so far.
72M is a really big number to generate embeddings for, huh
d
A few more days I guess. Is this a common use case for what I need? I’m only expecting to have max 200M products long term I think
k
Alters tend to run slowly to prevent disruption to ongoing search. But in this case, with embeddings, it's very CPU intensive, so even then it's maxing out the CPU.
d
Can we upgrade again to speed it up?
k
Alters are throttled by nature, so a larger instance might not help either. So sorry about this, this is not something we thought through. In fact, we didn't allow altering embedding fields for a long time.
What embedding model are you using?
d
All good!
I followed the documentation so I think it was ts/all-MiniLM-L12-v2
a
It was e5-small, with 384 dimensions.
Good thing your cluster is still answering requests though
d
Do you have any other users using e5-small for product search, and are the results accurate? I'm assuming if I had to change embedding models, it'll take another few days to alter?
k
E5-small is a good all-round model. We do have people using it for e-commerce.
I think it might be easier for you to generate the embeddings externally and ingest into typesense. You can still use the embedding field for querying. During indexing we will skip triggering the embedding if a field already has vector values.
Easiest course of action I see is for you to do this on another node and shut this one down.
If you generate outside you have an option to use GPUs so the entire process will be very fast.
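A minimal sketch of the external-embedding ingestion described above: each document in the JSONL import body carries a precomputed vector in the embedding field, so the server skips generating it. Field and collection names follow this thread; the 3-dimension vectors are toy values (a real e5-small vector has 384 dimensions):

```python
import json

def to_import_jsonl(docs, vectors):
    """Build a JSONL body for
    POST /collections/products/documents/import?action=upsert,
    attaching a precomputed vector to each document's embedding field."""
    lines = []
    for doc, vec in zip(docs, vectors):
        lines.append(json.dumps({**doc, "embedding": vec}))
    return "\n".join(lines)

# Toy example: one product with a 3-dim vector for illustration
docs = [{"id": "1", "name": "AirPods Pro 3", "brand_name": "Apple"}]
vectors = [[0.12, -0.53, 0.88]]
print(to_import_jsonl(docs, vectors))
```

Because the vectors are generated outside Typesense (e.g. on a GPU box), the import itself is just ordinary document indexing.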
d
Once this finishes, it should be ok though right? I don’t expect too many new products being added in the short term
k
Yeah once it's done it is fine
a
Hey Denny, just doing an EOD update here:
Copy code
2025-09-17 20:56:22.000 UTC I Altered 19562496 so far.
d
Thanks for the update!
@Kishore Nallan @Alan Martini - The update is now done. I'm just testing it, but the results are terrible. None of the returned products are even AirPods. Is this the correct way to use it?
Copy code
curl 'https://xxxxxxx-1.a1.typesense.net/multi_search?q=apple+airpods+pro+3&conversation=false&conversation_model_id=conv-model-1' \
        -X POST \
        -H "Content-Type: application/json" \
        -H "X-TYPESENSE-API-KEY: xxxxxxxxxxxx" \
        -d '{
              "searches": [
                {
                  "collection": "products",
                  "query_by": "embedding",
                  "exclude_fields": "embedding"
                }
              ]
            }'
k
There is no need to use a conversation model here. Can you try without it?
d
[Screenshots attached: Screenshot 2025-09-20 at 9.31.49 am.png, Screenshot 2025-09-20 at 9.31.57 am.png]
first result is an iPhone
k
Are you expecting the query to match exactly, with the same version number? That's not how embedding-based vector search works. It gives only results that are close in semantic space, so it's used to handle synonyms and words that mean the same thing (e.g. sneakers vs shoes). You are not going to get exact e-commerce matches based on model/attribute keywords, because these models are trained only on semantic similarity.
d
Hrmm, I mean I was hoping for at least matching AirPods. I think doing hybrid search with name and embedding, then reranking, works
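The hybrid approach Denny lands on can be sketched as a multi_search body: listing both a keyword field and the embedding field in query_by makes Typesense combine keyword and vector results. The alpha value (weight given to the vector score) is an illustrative choice, not one recommended in this thread:

```python
import json

# Hedged sketch of a hybrid search request: keyword matching on "name"
# plus semantic matching on "embedding", with rank fusion between the two.
hybrid_search = {
    "searches": [
        {
            "collection": "products",
            "q": "apple airpods pro 3",
            "query_by": "name,embedding",
            "exclude_fields": "embedding",
            # alpha controls the weight of the vector score (illustrative value)
            "vector_query": "embedding:([], alpha: 0.3)",
        }
    ]
}

# This body would be POSTed to /multi_search (no conversation model needed)
print(json.dumps(hybrid_search, indent=2))
```

With a lower alpha, exact keyword matches like "AirPods Pro 3" dominate while the embedding still helps with synonyms.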