#community-help

Nesting, Vector Search, and Scalability in Typesense

TLDR Vishal asks about nesting, vector search, and scalability in Typesense. Kishore Nallan explains that nesting can go to any level, that a vector is built from combined field values and stored in a separate field, and that Typesense uses the hnsw library for vector search.

May 09, 2023 (7 months ago)
Vishal
02:16 AM
Question: let's say I have automobile manufacturers as collections (honda, toyota, etc.). Within each manufacturer, I want to store multiple models, with attributes for those models. How does this work with Typesense? Is this possible? How far does the nesting go?
Kishore Nallan
02:35 AM
You can nest to any level. If you can post a sample document, I can comment further.
Vishal
02:46 AM
Would look like this:
• honda
    ◦ country of origin
    ◦ otherstats1
    ◦ otherstats2
    ◦ automobiles
        ▪︎ model_id: 01
            • model_name: accord
            • model_year: 2020
            • model_color: blue
                ◦ color_attribute1: 34
                ◦ color_attribute2: 343
        ▪︎ model_id: 02
            • model_name: accord
            • model_year: 2020
            • model_color: blue
                ◦ color_attribute1: 34
                ◦ color_attribute2: 343
Vishal
02:47 AM
So would I be able to search for models with color attributes above a certain value?
Vishal
02:47 AM
(across all manufacturers honda, toyota, etc)
Kishore Nallan
02:48 AM
Yes, but will a single document store everything?
Vishal
02:49 AM
I am considering each model_id to be a separate document
Vishal
02:49 AM
is this correct?
Kishore Nallan
02:51 AM
Yes, correct, and you can use group_by if you want to aggregate by brand.
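A minimal sketch of the "one document per model" layout agreed on above, in plain Python with no Typesense client involved. The sample values, the `flatten_models` helper, the `id` format, and the choice to model `model_color` as an object with a `name` key are all assumptions made for illustration. The `filter_by` and `group_by` keys are Typesense search parameters, but the dotted nested-field filter and the requirement that a grouped field be faceted should be checked against the docs for the version in use.

```python
# Hypothetical input mirroring the outline in the thread:
# one nested record per manufacturer, with models under "automobiles".
manufacturers = [
    {
        "manufacturer": "honda",
        "country_of_origin": "Japan",
        "automobiles": [
            {
                "model_id": "01",
                "model_name": "accord",
                "model_year": 2020,
                # Assumption: the color and its attributes live in one object.
                "model_color": {"name": "blue", "color_attribute1": 34, "color_attribute2": 343},
            },
            {
                "model_id": "02",
                "model_name": "accord",
                "model_year": 2020,
                "model_color": {"name": "blue", "color_attribute1": 34, "color_attribute2": 343},
            },
        ],
    },
]


def flatten_models(makers):
    """Yield one document per model, copying manufacturer-level fields down."""
    for maker in makers:
        shared = {key: value for key, value in maker.items() if key != "automobiles"}
        for model in maker["automobiles"]:
            doc = dict(shared)
            doc.update(model)
            # Hypothetical id scheme: manufacturer name plus model_id.
            doc["id"] = f'{maker["manufacturer"]}-{model["model_id"]}'
            yield doc


documents = list(flatten_models(manufacturers))

# Search parameters one might send for "color attribute above a value,
# across all manufacturers, grouped by brand". Treat the exact filter
# syntax for nested fields as an assumption to verify.
search_parameters = {
    "q": "*",
    "filter_by": "model_color.color_attribute1:>30",
    "group_by": "manufacturer",  # the grouped field likely needs facet: true in the schema
}
```

Because the manufacturer fields are copied onto every model document, a filter across all brands needs no join, and `group_by` can bucket the hits per manufacturer.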
Vishal
02:52 AM
OK, so in SQL one would likely make two tables, one for the highest level (manufacturers) and one for car models (models), and join the two on some key. In Typesense, everything can just be nested indefinitely?
Vishal
02:53 AM
Also, for vector search, will this work at any level of nesting?
Kishore Nallan
02:57 AM
For vector search, you have to choose a string to vectorize.
Vishal
02:58 AM
For vector search, I expect to be able to store an n-dimensional vector as an array at any arbitrary level of nesting.
Vishal
02:59 AM
(e.g., in the example I gave, assume model_id has a model_vector and color_attribute1 has a color_attribute1_vector)
Kishore Nallan
03:01 AM
Vectors need a large context to be effective. You can combine several field values and then vectorize that.
Kishore Nallan
03:01 AM
As long as you follow the same order in all records
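One way to read the advice above is to join several field values into a single string, in a fixed order, before handing it to an embedding model. The sketch below covers only that text-building step. The field list, the `text_to_embed` helper, and the `name: value` format are assumptions for illustration, and the embedding call itself is left out because it depends on the model chosen.

```python
# Fixed field order, reused for every record so the embedded text is comparable.
# Hypothetical choice of fields; pick whichever carry useful context.
FIELD_ORDER = ["manufacturer", "model_name", "model_year", "model_color"]


def text_to_embed(record, field_order=FIELD_ORDER):
    """Join the chosen fields into one string, always in the same order."""
    parts = []
    for name in field_order:
        value = record.get(name)
        if value is not None:  # skip fields the record does not have
            parts.append(f"{name}: {value}")
    return "; ".join(parts)


# The dict's key order does not matter; FIELD_ORDER decides the output order.
record = {
    "model_color": "blue",
    "model_name": "accord",
    "manufacturer": "honda",
    "model_year": 2020,
}
text = text_to_embed(record)
print(text)  # manufacturer: honda; model_name: accord; model_year: 2020; model_color: blue
```

The resulting string is what would be passed to the external model, and the returned vector stored alongside the record.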
Vishal
03:01 AM
Understood. The generation of said vectors would happen externally and would not necessarily index the text fields therein.
Kishore Nallan
03:02 AM
In the next version of Typesense, we can do this augmentation of fields automatically to generate a vector.
Vishal
03:02 AM
which model would you use?
Kishore Nallan
03:02 AM
But yes, it must be a separate field, which you can then vectorize outside Typesense.
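A rough sketch of what "a separate field" for an externally generated vector could look like. It builds plain dictionaries and a query string only, with no client calls. The collection name, the field names, the 4-dimension size, and both helper functions are hypothetical. The `float[]` type with `num_dim` and the `vector_query` parameter follow the Typesense vector search documentation as I understand it, so verify them against the version in use.

```python
NUM_DIM = 4  # tiny for illustration; a real embedding model fixes this number

# Hypothetical collection schema with a dedicated vector field.
schema = {
    "name": "models",
    "fields": [
        {"name": "model_name", "type": "string"},
        {"name": "model_vector", "type": "float[]", "num_dim": NUM_DIM},
    ],
}


def make_document(doc_id, model_name, vector):
    """Build a document, rejecting vectors of the wrong length."""
    if len(vector) != NUM_DIM:
        raise ValueError(f"expected {NUM_DIM} dimensions, got {len(vector)}")
    return {
        "id": doc_id,
        "model_name": model_name,
        "model_vector": [float(x) for x in vector],
    }


def make_vector_query(field, vector, k=10):
    """Format a nearest-neighbor query string for the given vector field."""
    values = ", ".join(str(float(x)) for x in vector)
    return f"{field}:([{values}], k:{k})"


doc = make_document("honda-01", "accord", [0.1, 0.2, 0.3, 0.4])
search_parameters = {
    "q": "*",
    "vector_query": make_vector_query("model_vector", [0.1, 0.2, 0.3, 0.4], k=5),
}
```

Checking the vector length before indexing catches a mismatch between the embedding model and the schema early.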
Vishal
03:02 AM
OK, a 2000-dimension vector is fine, I assume?
Kishore Nallan
03:02 AM
We support a few. E.g. sentence transformers models.
Kishore Nallan
03:03 AM
We also support OpenAI integration.
Vishal
03:03 AM
understood, will likely be using something from transformers
Vishal
03:04 AM
OK, one more question: is the cosine similarity implementation scalable and on par with faiss?
Kishore Nallan
03:05 AM
We use the hnsw library, which is pretty robust.
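For reference, cosine similarity itself is simple to state. The sketch below is an exact, brute-force version in plain Python. It is not how Typesense or the hnsw library computes results: HNSW is an approximate nearest-neighbor index that avoids comparing the query against every stored vector. The function names and sample vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=1):
    """Exact top-k by cosine similarity; scans every stored vector."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical 2-dimensional vectors keyed by document id.
stored = {"honda-01": [1.0, 0.0], "toyota-01": [0.0, 1.0]}
closest = nearest([1.0, 0.1], stored, k=1)
```

The full scan is linear in the number of stored vectors, which is the cost an approximate index is meant to avoid on large collections.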
Vishal
03:08 AM
OK, thanks for all of the help. Big fan of Typesense from what I've seen so far!
Kishore Nallan
03:20 AM
Thank you!
