# community-help
k
👋 Hi – new Typesense user here. I am building the search solution for tradingstrategy.ai and have identified Typesense as the likely backend. I have a question about how best to implement updates using JSONL import and aliases.
My plan is to follow the pattern described here: https://typesense.org/docs/0.22.2/api/collection-alias.html (populate a new collection daily using `documents/import`, then toggle the collection alias to point to the new collection).
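In code, that plan looks roughly like the sketch below, using the official `typesense` Python client (the connection details, collection names, and schema are placeholders, not from this thread):

```python
import datetime
import typesense

# Placeholder connection details – adjust for your deployment.
client = typesense.Client({
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
    'connection_timeout_seconds': 600,  # large imports can take a while
})

# 1. Create today's collection (hypothetical "trades" schema).
today = datetime.date.today().strftime('%Y%m%d')
new_name = f'trades_{today}'
client.collections.create({
    'name': new_name,
    'fields': [
        {'name': 'ticker', 'type': 'string'},
        {'name': 'description', 'type': 'string'},
        {'name': 'volume', 'type': 'float'},
    ],
})

# 2. Bulk-load the day's JSONL export into the new collection.
with open('trades.jsonl') as f:
    client.collections[new_name].documents.import_(f.read(), {'action': 'create'})

# 3. Flip the alias the application searches against to the new collection.
client.aliases.upsert('trades', {'collection_name': new_name})
```

Dropping the previous day's collection after the alias flip would complete the rotation.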
k
👋 Very interesting site! The alias-based approach works well if your content refresh happens only periodically, so you can just do a full refresh of the index.
For streaming or ad hoc updates, you can use the `upsert` or `update` action of the import endpoint.
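A minimal sketch of those two actions with the Python client (the collection name and documents here are hypothetical):

```python
import typesense

client = typesense.Client({  # placeholder connection details
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
})

# Upsert: creates the document if the id is new, replaces it otherwise.
client.collections['trades'].documents.import_(
    [{'id': '124', 'ticker': 'BTC-USD', 'volume': 1250.0}],
    {'action': 'upsert'},
)

# Update: partially updates an existing document by id.
client.collections['trades'].documents.import_(
    [{'id': '124', 'volume': 1300.0}],
    {'action': 'update'},
)
```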
k
My question is… how do I know when the new collection is indexed and ready to use? Do I need to poll with `GET /collections/foo`, or is there a callback that can notify me when the new collection is ready?
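(For reference, the polling variant would look something like the sketch below with the Python client; `expected_count` and the collection name are placeholders. As the reply further down notes, the import call itself blocks until indexing finishes, so this is usually unnecessary.)

```python
import time
import typesense

client = typesense.Client({  # placeholder connection details
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
})

expected_count = 1_600_000  # placeholder: number of records in the JSONL export

# Poll the collection metadata until the document count reaches the export size.
while True:
    info = client.collections['trades_20240101'].retrieve()
    if info['num_documents'] >= expected_count:
        break
    time.sleep(5)
```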
My concern with using incremental updates: it adds some complexity (you need to account for adds, updates, and deletes), and you almost always get some “downstream drift” in your data, requiring periodic full updates anyway… so starting with full updates seems simpler.
k
If that works for your update frequency, then it's certainly the easiest approach 👍
k
Thanks. Any feedback on my question regarding how to know when the new collection's index is ready (i.e. when to toggle the alias)?
k
When your import call returns, that indicates that the collection has been indexed.
k
OK… so I can synchronously toggle the alias as soon as the import request is complete? I noticed previously, when importing a large collection (1.6M records), that I was getting some kind of “not ready” response to `GET /collections/foo` requests as well as to `search` requests.
k
When you import a large batch in one go, your writes can lag, and this can trigger the max-read-lag and max-write-lag configuration thresholds, at which point the system will think it is lagging behind heavily and will return “not ready” to prevent stale results from being served. To prevent this from happening, split your imports into batches that are not too large. We have some work planned to make the import endpoint automatically slow down for large uploads, which should make this easier.
k
👍 thanks – that helps! Any suggestion on max batch size?
k
It will depend on how many fields you are indexing and whether the fields contain large text, etc.
I'd recommend starting with about 2,000–3,000 documents per batch and then revising based on observations.
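Putting the batching advice together, a rough sketch, assuming the Python client and a local JSONL export (the file name, collection name, and connection details are placeholders):

```python
import json
import typesense

client = typesense.Client({  # placeholder connection details
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
    'connection_timeout_seconds': 600,
})

BATCH_SIZE = 2500  # starting point per the advice above; tune from observations


def import_in_batches(collection: str, jsonl_path: str) -> None:
    """Stream a JSONL file into Typesense in modest batches so writes never
    lag far enough behind to trip the server's lag thresholds."""
    batch = []
    with open(jsonl_path) as f:
        for line in f:
            batch.append(json.loads(line))
            if len(batch) >= BATCH_SIZE:
                _flush(collection, batch)
                batch = []
    if batch:
        _flush(collection, batch)


def _flush(collection: str, batch: list) -> None:
    # The client returns one result dict per document; surface any failures.
    results = client.collections[collection].documents.import_(
        batch, {'action': 'create'}
    )
    failures = [r for r in results if not r.get('success')]
    if failures:
        raise RuntimeError(f'{len(failures)} documents failed: {failures[:3]}')


import_in_batches('trades_20240101', 'trades.jsonl')
```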
k
👍 thanks – appreciate the support!