#community-help

Optimizing Schema and Reducing Memory Usage

TLDR Shaun faced issues with memory usage when adding a new field to a schema. Kishore Nallan advised optimizing the schema by disabling unnecessary facets and sorting options to reduce memory usage.

Jun 20, 2023 (5 months ago)
Shaun
07:27 AM
when trying to add a new field on a schema
Kishore Nallan
07:28 AM
The Typesense server instance is close to running out of system memory.
Shaun
07:33 AM
Hmm, I don't get it. I was able to make the same change on the development cluster, which has the same docs in it and the same collections, memory, etc.
Kishore Nallan
07:34 AM
DM me the cluster ID
Shaun
07:35 AM
Done
Kishore Nallan
07:38 AM
The prod cluster is using 261M of RAM and 192M of swap space, for a total of about 452M, which is very close to the total available memory, so writes are throttled. Dev is at a similar memory usage, but the actual usage can differ slightly from instance to instance depending on things like the cache used during searching.
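
For anyone wanting to reproduce the RAM-plus-swap arithmetic above, here is a minimal Python sketch. It assumes a metrics mapping shaped like the response of Typesense's `/metrics.json` endpoint; the key names are taken from that response as I understand it and may differ between versions, so treat them as assumptions. The 512 MB total is a made-up placeholder, since the thread does not state the cluster size.

```python
MB = 1024 * 1024


def memory_summary(metrics: dict) -> dict:
    """Summarise RAM and swap usage (in MB) from a metrics mapping.

    Values are converted with float() because the metrics response is
    assumed to report numbers as strings.
    """
    def mb(key: str) -> float:
        # Missing keys are treated as 0 so the sketch works on partial data.
        return float(metrics.get(key, 0)) / MB

    ram_used = mb("system_memory_used_bytes")
    swap_used = mb("system_memory_used_swap_bytes")
    return {
        "ram_used_mb": ram_used,
        "swap_used_mb": swap_used,
        "combined_used_mb": ram_used + swap_used,
        "ram_total_mb": mb("system_memory_total_bytes"),
    }


# Sample input using the figures quoted in the thread.
sample = {
    "system_memory_used_bytes": str(261 * MB),
    "system_memory_used_swap_bytes": str(192 * MB),
    "system_memory_total_bytes": str(512 * MB),  # hypothetical cluster size
}
summary = memory_summary(sample)
print(summary)
```

Comparing `combined_used_mb` with the instance size gives a rough idea of how much headroom is left before writes are likely to be throttled.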
Shaun
07:40 AM
Gotcha.
Shaun
07:41 AM
I think they would basically be using the same memory for the actual docs, as they have the same data in them. Is there anything to optimise or change, or if we wait, will the swap decrease as memory is reclaimed? All I want to do is add another field.
Kishore Nallan
07:41 AM
No, you have to either upsize or delete some documents or fields to make room.
Shaun
07:42 AM
OK, got it. Question: the field we are adding is userId. If we just want to do an exact match on that ID, would we need facet:true on it, or would facet:false still be OK?
Kishore Nallan
07:44 AM
You want to use it during filtering? You don't need to enable facet for that.
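
As a rough illustration of filtering without faceting, the Python sketch below defines a `userId` field with `facet` turned off and builds an exact-match filter for it. The `string` type, the `user_123` value, and the helper name `exact_match_filter` are assumptions made for the example; the thread does not say what type `userId` has.

```python
# Hypothetical field definition: indexed so it can be filtered on,
# but with no facet index built for it.
user_id_field = {
    "name": "userId",
    "type": "string",  # assumed; the thread does not state the type
    "facet": False,
}


def exact_match_filter(field: str, value: str) -> str:
    """Build a filter_by clause using the := exact-match operator."""
    return f"{field}:={value}"


# q="*" matches every document, so only the filter narrows the results.
search_params = {
    "q": "*",
    "filter_by": exact_match_filter("userId", "user_123"),
}
print(search_params)
```

With the official Python client, a dict like `search_params` would typically be passed to the collection's `documents.search(...)` call.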
Shaun
07:45 AM
OK, got it. So I think we could optimise the current schemas then. For each collection we currently have facet:true on flags like verified and deleted.
Shaun
07:46 AM
Would these be fine with facet:false? They are just bool fields, which we would filter on, such as deleted=false.
Kishore Nallan
07:49 AM
Yeah, no need for facet for that.
Shaun
07:49 AM
Ah OK, nice. So making those changes should decrease the memory usage, I would expect?
Kishore Nallan
07:50 AM
Yes
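
The change discussed here (turning off `facet` on the `verified` and `deleted` flags) could be sketched in Python as follows. This assumes the collection update API accepts a field being dropped and re-added in the same request, which is my understanding of how existing fields are altered; check this against the Typesense version in use. The helper name `refacet_payload` and the `title` field are hypothetical.

```python
def refacet_payload(fields: list, names: set) -> dict:
    """Build an update payload that drops each named field and
    re-adds it with facet disabled, leaving other fields untouched."""
    ops = []
    for field in fields:
        if field["name"] in names:
            # Drop the existing definition first...
            ops.append({"name": field["name"], "drop": True})
            # ...then re-add it unchanged except for facet.
            ops.append({**field, "facet": False})
    return {"fields": ops}


# Example input: two flag fields plus an unrelated, made-up field.
current_fields = [
    {"name": "deleted", "type": "bool", "facet": True},
    {"name": "verified", "type": "bool", "facet": True},
    {"name": "title", "type": "string", "facet": False},
]
payload = refacet_payload(current_fields, {"deleted", "verified"})
print(payload)
```

The resulting `payload` is what would be sent as the body of the collection update request; fields not named in the set are left out of it entirely.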
Shaun
07:50 AM
Nice. Will make those changes and see; otherwise we will increase the memory size of the cluster.
Shaun
07:55 AM
Even when updated to

    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "deleted",
      "optional": false,
      "sort": false,
      "type": "bool"
    }

they still have index:true. Is that fine and expected?
Kishore Nallan
07:56 AM
That's fine
Shaun
07:56 AM
OK cool, thanks.
Shaun
08:03 AM
Does sort:true make much difference to memory usage? I take it that if this is false you can't sort by that field, but it seems to be set to true by default for all fields.
Kishore Nallan
08:04 AM
People usually want to be able to sort on numerical fields, so we enable it by default. But you can disable it if you don't need it.
Shaun
08:04 AM
Got it. I guess bool is just treated as numeric under the hood, as 0/1, since those all get it by default.
Kishore Nallan
08:04 AM
Yes
Shaun
08:05 AM
It won't make much difference to memory for bool fields? So is it safe to just leave it on as the default?
Shaun
08:07 AM
Or for other int32 fields, for example?
Kishore Nallan
08:19 AM
Don't enable sorting if it's not needed. Every type of index does use memory.
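
To act on this advice, one option is to scan a collection schema for fields that still have optional indexes switched on. The Python sketch below does that for a schema dict shaped like the field JSON shown earlier in the thread; the helper name `index_candidates` and the sample fields other than `deleted` are hypothetical.

```python
def index_candidates(schema: dict) -> list:
    """Return (field name, enabled options) pairs for every field that
    has facet, sort or infix switched on in the given schema."""
    report = []
    for field in schema.get("fields", []):
        enabled = [opt for opt in ("facet", "sort", "infix") if field.get(opt)]
        if enabled:
            report.append((field["name"], enabled))
    return report


# Sample schema; only "deleted" comes from the thread.
schema = {
    "name": "example_collection",
    "fields": [
        {"name": "deleted", "type": "bool", "facet": False, "sort": False},
        {"name": "verified", "type": "bool", "facet": True, "sort": True},
        {"name": "createdAt", "type": "int32", "facet": False, "sort": True},
    ],
}
for name, options in index_candidates(schema):
    print(name, options)
```

This only inspects explicit values, so it is best run on a schema retrieved from the server, where defaults such as `sort` on numeric and bool fields appear spelled out. Each reported field is a candidate to review, not necessarily one to change: sorting should stay enabled on any field that is actually used in `sort_by`.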
Shaun
08:19 AM
OK, got it. Makes sense, thanks.
Jun 21, 2023 (5 months ago)
