#community-help

Discussing Data Retrieval in Typesense Cloud Tool

TLDR Ricardo inquired about the impact of using non-searched fields in data records with Typesense. Jason explained that all fields are fetched from the disk, even if unindexed, pointing out it might not affect performance, with the benefit of reducing separate database API calls.

Apr 30, 2021 (31 months ago)
Ricardo
07:56 PM
the consequence of this I assume is that my data will take longer to retrieve if I want to retrieve those fields (not search through them)? Anything else I should be aware of?
Jason
07:57 PM
That's pretty much it. We do use RocksDB to store these docs on disk, so it should still be fast to retrieve. But if you notice any performance issues, using SSDs would help
Ricardo
08:01 PM
so question, why would one use this? dataset too big and can't fit on ram?
Jason
08:07 PM
This would be useful for cases where you don't need to search through all the data in your record, but still want to use the data from the record for say display purposes. So instead of having to make an API call to Typesense first to search through the data and then separately make an API call to your database to fetch other fields for the record, you can put all related data in Typesense to reduce multiple API calls
Jason
08:09 PM
For example, let's say you're storing metadata about videos in your records. You want to allow users to search by title and author of the video, and you also want to link to, say, the YouTube link in your search UI.

Though you're not searching directly in the YouTube link field, you can still store it in the record, so you can use it to render your search UI efficiently, with just the response from Typesense
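
A minimal sketch of the video example above. The collection name `videos`, the field name `youtube_url`, and the sample values are all hypothetical. Only `title` and `author` are declared in the schema, so only they would be indexed for search; the extra field simply travels with the stored document.

```python
# Hypothetical schema: declare only the fields users will search on.
videos_schema = {
    "name": "videos",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "author", "type": "string"},
    ],
}

# Hypothetical document: it carries one extra field that is not in the schema.
video_doc = {
    "id": "1",
    "title": "Intro to search",
    "author": "Jane Doe",
    "youtube_url": "https://www.youtube.com/watch?v=EXAMPLE",
}

# Work out which fields are display-only (stored, but not declared for indexing).
indexed_fields = {field["name"] for field in videos_schema["fields"]}
display_only = sorted(set(video_doc) - indexed_fields - {"id"})
print(display_only)
```

With the `typesense` Python client, dicts shaped like these would be passed to `client.collections.create(...)` and `client.collections["videos"].documents.create(...)`; those calls are left out here so the sketch runs without a server.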
Ricardo
08:20 PM
thanks for the detailed explanation
Ricardo
08:21 PM
That's what I was going to use it for, but I'm concerned it might impact the results if they all have this extra field that I don't search through. That said, thinking about it, the extra database call wouldn't be any better.
Ricardo
08:21 PM
This isn't a concern now I'm just getting started
Ricardo
08:22 PM
but is there a way to keep it in memory but not be searchable?
Ricardo
08:22 PM
ignore that, I define which fields get searched anyway, so it doesn't matter
Ricardo
08:23 PM
thanks 🙂
Jason
08:42 PM
I'd recommend benchmarking to see how much performance impact there is if you add non-indexed fields to the document. I'd suspect it's minimal based on what I've seen. For eg, in the songs showcase I do exactly what I mentioned above - the URLs for each song are unindexed and stored on disk and it still seems pretty fast.
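
One rough way to run the benchmark suggested above is to time the same query against two versions of a collection, one with the extra fields and one without. This is only a sketch: `median_latency_ms` and `fake_search` are hypothetical names, and `fake_search` is a placeholder to be swapped for a real search call.

```python
import statistics
import time


def median_latency_ms(run_search, runs=50):
    """Call run_search() `runs` times and return the median latency in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_search()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)


def fake_search():
    # Placeholder workload; replace with the actual search request being measured.
    sum(range(1000))


print(f"median latency: {median_latency_ms(fake_search):.3f} ms")
```

Using the median rather than the mean keeps a single slow request from skewing the comparison.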
May 01, 2021 (31 months ago)
Jason
02:01 AM
Correction to what I said earlier: once the list of document IDs is determined from the in-memory indices, we actually fetch all fields from disk (unless specified otherwise in the include_fields param) to assemble the final result document, regardless of whether a field is indexed or not. So performance will be identical.
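
The two-step flow described in this correction can be pictured with a toy model. This is not Typesense's implementation: plain dicts stand in for the in-memory index and the on-disk store, all names and data are made up, and the field filtering here happens after the lookup, which may differ from how `include_fields` is applied internally.

```python
# Toy stand-ins: token -> document IDs (memory), document ID -> full document (disk).
memory_index = {"intro": ["1", "2"], "search": ["1"]}
disk_store = {
    "1": {"id": "1", "title": "Intro to search", "youtube_url": "https://example.com/1"},
    "2": {"id": "2", "title": "Intro to ranking", "youtube_url": "https://example.com/2"},
}


def search(token, include_fields=None):
    """Resolve matching IDs from the index, then assemble results from the store."""
    matching_ids = memory_index.get(token, [])
    results = []
    for doc_id in matching_ids:
        doc = disk_store[doc_id]  # the whole stored document, indexed fields or not
        if include_fields is not None:
            doc = {key: value for key, value in doc.items() if key in include_fields}
        results.append(doc)
    return results


print(search("search"))
print(search("intro", include_fields={"title"}))
```

In this model every hit costs one store lookup no matter how many of its fields were indexed, which is the point of the correction.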
Ricardo
05:31 AM
thanks
May 02, 2021 (31 months ago)
Ricardo
06:12 AM
Jason, just a follow-up on the previous statement. So even if all my fields are in the in-memory indices, everything (or whatever is specified in include_fields) still gets fetched from disk?
Jason
03:15 PM
That's correct
Jason
03:17 PM
Everything = the final documents that will be returned as part of the response, so 10 documents by default since that's the default per_page value
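
To make that concrete, here is a small sketch of search parameters that set both knobs mentioned in this thread. The query, the field names, and the helper `docs_assembled` are hypothetical; `include_fields` and `per_page` are the parameters discussed above.

```python
# Hypothetical search parameters for a collection with title/author/youtube_url fields.
search_parameters = {
    "q": "intro",
    "query_by": "title,author",             # fields that are actually searched
    "include_fields": "title,youtube_url",  # fields to return in each hit
    "per_page": 10,                         # the default page size, stated explicitly
}


def docs_assembled(total_matches, per_page=10):
    """Number of documents put together for one page of results."""
    return min(total_matches, per_page)


print(docs_assembled(250))
print(docs_assembled(3))
```

With the `typesense` Python client, a dict like this would be passed to `client.collections["videos"].documents.search(search_parameters)`, where `videos` is an assumed collection name.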
