#community-help

Fetching All Docs from a Collection in Typesense

TLDR: Julian asked whether all docs could be fetched from a Typesense collection in one request. Kishore Nallan explained that the limit of 250 results per page exists for performance reasons and suggested paginating with multi_search. Andrew suggested the export function and described how his team's proxy works and performs.

Sep 09, 2022 (12 months ago)
Julian
10:33 AM
Does anybody know if it's possible to fetch all docs (>1k) from a collection in one request? Or is 250 hits/page the request limit?
Kishore Nallan
10:36 AM
You can use a multi_search to paginate in a single request.
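A minimal sketch of what paginating through `multi_search` in one request could look like. It assumes the total document count is already known (for example from the `found` value of an earlier query) and that the server is reached with the `X-TYPESENSE-API-KEY` header; the function names and the `base_url` parameter are illustrative, not part of any Typesense client.

```python
import json
import math
import urllib.request

PER_PAGE_LIMIT = 250  # the per-page cap discussed in this thread


def build_paged_searches(collection, total_docs, per_page=PER_PAGE_LIMIT):
    """Build one multi_search body with a wildcard search per page."""
    pages = math.ceil(total_docs / per_page)
    return {
        "searches": [
            {"collection": collection, "q": "*", "page": page, "per_page": per_page}
            for page in range(1, pages + 1)
        ]
    }


def collect_documents(multi_search_response):
    """Flatten the per-page results into a single list of documents."""
    docs = []
    for result in multi_search_response.get("results", []):
        for hit in result.get("hits", []):
            docs.append(hit["document"])
    return docs


def fetch_all(base_url, api_key, collection, total_docs):
    """POST the combined body to /multi_search (needs a running server)."""
    body = json.dumps(build_paged_searches(collection, total_docs)).encode()
    request = urllib.request.Request(
        f"{base_url}/multi_search",
        data=body,
        headers={"X-TYPESENSE-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return collect_documents(json.load(response))
```

Each page is still a separate search on the server side, so this saves round trips rather than work.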
Julian
10:56 AM
Mhm, I see. This is a little bit of a pain when using with instantsearch.js though. Any plans to increase the limit of 250 results or any reasons why not to?
Kishore Nallan
10:58 AM
Primarily because Typesense is a search engine and optimized very much for fetching the "top" records for a given query. So while we can fetch greater than 250 results on pagination, that's going to become slower as you progress through. That limit exists to be a reminder of this limitation. We don't want people accidentally blowing up a search with a 1000 page fetch on a large dataset.
Kishore Nallan
10:59 AM
Maybe we should add a flag for overriding this behavior.
Julian
10:59 AM
Fair point. Thanks for explaining 👍
Julian
11:00 AM
> Maybe we should add a flag for overriding this behavior.
That could be a decent compromise, yes 👍
Sep 11, 2022 (12 months ago)
Andrew
08:09 AM
Julian, we use the export function for this.
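For readers unfamiliar with it, here is a rough sketch of calling the export endpoint, which returns the collection as JSON Lines (one document per line). The helper names and the `base_url` parameter are made up for illustration; only the endpoint path, the `include_fields` parameter and the `X-TYPESENSE-API-KEY` header come from the Typesense API.

```python
import json
import urllib.parse
import urllib.request


def export_url(base_url, collection, include_fields=None):
    """Build the export URL, optionally restricting the returned fields."""
    url = f"{base_url}/collections/{urllib.parse.quote(collection)}/documents/export"
    if include_fields:
        query = urllib.parse.urlencode({"include_fields": ",".join(include_fields)})
        url += "?" + query
    return url


def parse_jsonl(text):
    """Parse a JSON Lines payload into a list of documents."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def export_all(base_url, api_key, collection, include_fields=None):
    """Download and parse the whole collection (needs a running server)."""
    request = urllib.request.Request(
        export_url(base_url, collection, include_fields),
        headers={"X-TYPESENSE-API-KEY": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return parse_jsonl(response.read().decode("utf-8"))
```

This reads the full response into memory, which is fine for a thousand documents or so; a much larger collection would be better handled line by line.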
Sep 12, 2022 (12 months ago)
Julian
06:10 PM
Andrew, I see. Do you use this for runtime tasks with your whole data set? May I ask how many docs per collection we are talking about?
Andrew
06:12 PM
Yes. Very roughly it's 1.5 million docs
Andrew
06:12 PM
Julian
Andrew
06:13 PM
Wait... I am getting mixed up between threads. We export roughly 1,000 docs.
Julian
06:13 PM
OK, that's quite a lot. And you are not experiencing bandwidth issues or the like? Do you perform any further client-sided mutations to the data after fetching?
Julian
06:17 PM
OK, that's indeed another cup size. May I ask for a (very brief!) outline of your use case? Just so I know if it's anything like what we are dealing with?
Andrew
06:21 PM
We've actually built a kind of proxy that uses a template containing a load of Typesense requests. It executes all those requests, collates the responses, throws away unnecessary data (we'll need to keep throwing data away until Typesense implements a feature for doing this inside complex objects), and then returns it all to the client in one go.
Andrew
06:22 PM
We're basically using Typesense as an in-memory DB.
(We're also using it the normal way, i.e. for typo-tolerant search.)
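The proxy idea Andrew describes might be sketched roughly as follows. This is not his implementation: the template shape (a name mapped to search parameters), the `execute` callable that sends a `multi_search` body and returns the parsed response, and the dotted-path `keep` lists are all assumptions made for illustration.

```python
def pick(document, paths):
    """Return a copy of `document` holding only the given dotted paths."""
    out = {}
    for path in paths:
        keys = path.split(".")
        value = document
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                break  # path is missing in this document, skip it
            value = value[key]
        else:
            target = out
            for key in keys[:-1]:
                target = target.setdefault(key, {})
            target[keys[-1]] = value
    return out


def run_template(template, execute, keep):
    """Run every search in the template at once and collate trimmed results.

    template: {name: search parameters}
    execute:  callable taking a multi_search body, returning the response dict
    keep:     {name: list of dotted field paths to retain}
    """
    searches = [dict(params) for params in template.values()]
    response = execute({"searches": searches})
    collated = {}
    # multi_search returns results in the same order as the searches sent
    for name, result in zip(template, response["results"]):
        collated[name] = [
            pick(hit["document"], keep[name]) for hit in result.get("hits", [])
        ]
    return collated
```

Passing `execute` in as a parameter keeps the collation logic testable without a server; in production it would wrap an HTTP call to `/multi_search`.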
Julian
06:26 PM
Thanks for the insight ✅ And performance for this "expensive" proxy operation is still OK?
Andrew
06:45 PM
It's obviously nowhere near as fast as calling Typesense directly. But it's still amazing compared to using PostgreSQL which is what we were doing before. With "one-step" templates, where Typesense executes all the queries at the same time, and then the collation runs, we are usually sub 100ms. With "dynamic" templates, where we call Typesense initially with a bunch of queries, and then call it a 2nd time using results from the first query as inputs, we are usually sub 200ms. (Example: we retrieve a court judgement, which includes a list of the judges who participated, then in the second step, we query for other judgements where those judges participated)
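The "dynamic" two-step flow from the judgement example could look something like the sketch below. The collection name `judgements`, the `judges` field, and the `execute` callable (which sends a `multi_search` body and returns the parsed response) are hypothetical; the backtick quoting in `filter_by` is meant to protect names that contain commas.

```python
def judges_filter(judges, field="judges"):
    """Build a filter_by clause matching documents that list any given judge."""
    quoted = ", ".join(f"`{name}`" for name in judges)
    return f"{field}:=[{quoted}]"


def two_step(execute, first_search, collection="judgements", field="judges"):
    """Fetch one judgement, then fetch other judgements by the same judges."""
    # Step 1: retrieve the judgement itself.
    first = execute({"searches": [first_search]})["results"][0]
    hits = first.get("hits", [])
    if not hits:
        return None, []
    judgement = hits[0]["document"]
    judges = judgement.get(field, [])
    if not judges:
        return judgement, []

    # Step 2: use the judges from step 1 as input to a second query.
    second_search = {
        "collection": collection,
        "q": "*",
        "filter_by": judges_filter(judges, field),
    }
    second = execute({"searches": [second_search]})["results"][0]
    related = [
        hit["document"]
        for hit in second.get("hits", [])
        if hit["document"].get("id") != judgement.get("id")  # drop the original
    ]
    return judgement, related
```

The second call cannot start until the first returns, which fits the roughly doubled latency Andrew reports for dynamic templates.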
Sep 14, 2022 (12 months ago)
Julian
10:14 AM
Alright, got it. Thanks for sharing, really appreciated.