# community-help
k
👋 Hi – new Typesense user here. I am building the search solution for tradingstrategy.ai and have identified Typesense as the likely backend. I have a question about how best to implement updates using JSONL import and aliases.
My plan is to follow the pattern described here: https://typesense.org/docs/0.22.2/api/collection-alias.html (populate a new collection daily using `documents/import`, then toggle the collection alias to point to the new collection).
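In code, that plan looks roughly like the sketch below, using the official `typesense` Python client (the connection details, collection names, and schema are placeholders, not from this thread):

```python
import datetime
import typesense

# Placeholder connection details – adjust for your deployment.
client = typesense.Client({
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
    'connection_timeout_seconds': 600,  # large imports can take a while
})

# 1. Create today's collection (hypothetical "trades" schema).
today = datetime.date.today().strftime('%Y%m%d')
new_name = f'trades_{today}'
client.collections.create({
    'name': new_name,
    'fields': [
        {'name': 'ticker', 'type': 'string'},
        {'name': 'description', 'type': 'string'},
        {'name': 'volume', 'type': 'float'},
    ],
})

# 2. Bulk-load the day's JSONL export into the new collection.
with open('trades.jsonl') as f:
    client.collections[new_name].documents.import_(f.read(), {'action': 'create'})

# 3. Flip the alias the application searches against to the new collection.
client.aliases.upsert('trades', {'collection_name': new_name})
```

Dropping the previous day's collection after the alias flip would complete the rotation.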
k
👋 Very interesting site! The alias-based approach works well if your content refresh happens only periodically, so you can just do a full refresh of the index.
For streaming or ad hoc updates, you can use the `upsert` or `update` action of the import endpoint.
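A minimal sketch of those two actions with the Python client (the collection name and documents here are hypothetical):

```python
import typesense

client = typesense.Client({  # placeholder connection details
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
})

# Upsert: creates the document if the id is new, replaces it otherwise.
client.collections['trades'].documents.import_(
    [{'id': '124', 'ticker': 'BTC-USD', 'volume': 1250.0}],
    {'action': 'upsert'},
)

# Update: partially updates an existing document by id.
client.collections['trades'].documents.import_(
    [{'id': '124', 'volume': 1300.0}],
    {'action': 'update'},
)
```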
k
My question is… how do I know when the new collection is indexed and ready to use? Do I need to poll with `GET /collections/foo`, or is there a callback that can notify me when the new collection is ready?
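(For reference, the polling variant would look something like the sketch below with the Python client; `expected_count` and the collection name are placeholders. As the reply further down notes, the import call itself blocks until indexing finishes, so this is usually unnecessary.)

```python
import time
import typesense

client = typesense.Client({  # placeholder connection details
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
})

expected_count = 1_600_000  # placeholder: number of records in the JSONL export

# Poll the collection metadata until the document count reaches the export size.
while True:
    info = client.collections['trades_20240101'].retrieve()
    if info['num_documents'] >= expected_count:
        break
    time.sleep(5)
```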
My concern with using incremental updates: it adds some complexity (you need to account for adds, updates, and deletes), and you almost always get some “downstream drift” in your data, requiring periodic full updates anyway… so starting with full updates seems simpler.
k
If that works for your update frequency, then it's certainly the easiest approach 👍
k
Thanks. Any feedback on my question regarding how to know when the new collection's index is ready (i.e. when to toggle the alias)?
k
When your import call returns, that indicates that the collection has been indexed.
k
OK… so I can synchronously toggle the alias as soon as the import request is complete? I noticed previously, when importing a large collection (1.6M records), that I was getting some kind of “not ready” response to `GET /collections/foo` requests as well as to `search` requests.
k
When you import a large batch in one go, your writes can lag, and this can trigger the max-read-lag and max-write-lag configuration thresholds, at which point the system will think it is lagging behind heavily and will return “not ready” to prevent stale results from being served. To prevent this from happening, split your imports into batches that are not too large. We have some work planned to make the import endpoint automatically slow down for large uploads, which should make this easier.
k
👍 thanks – that helps! Any suggestion on max batch size?
k
It will depend on how many fields you are indexing and whether the fields contain large text, etc.
I'd recommend starting with about 2,000–3,000 documents per batch and then revising based on observations.
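Putting the batching advice together, a rough sketch, assuming the Python client and a local JSONL export (the file name, collection name, and connection details are placeholders):

```python
import json
import typesense

client = typesense.Client({  # placeholder connection details
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
    'connection_timeout_seconds': 600,
})

BATCH_SIZE = 2500  # starting point per the advice above; tune from observations


def import_in_batches(collection: str, jsonl_path: str) -> None:
    """Stream a JSONL file into Typesense in modest batches so writes never
    lag far enough behind to trip the server's lag thresholds."""
    batch = []
    with open(jsonl_path) as f:
        for line in f:
            batch.append(json.loads(line))
            if len(batch) >= BATCH_SIZE:
                _flush(collection, batch)
                batch = []
    if batch:
        _flush(collection, batch)


def _flush(collection: str, batch: list) -> None:
    # The client returns one result dict per document; surface any failures.
    results = client.collections[collection].documents.import_(
        batch, {'action': 'create'}
    )
    failures = [r for r in results if not r.get('success')]
    if failures:
        raise RuntimeError(f'{len(failures)} documents failed: {failures[:3]}')


import_in_batches('trades_20240101', 'trades.jsonl')
```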
k
👍 thanks – appreciate the support!