# community-help
a
Hi, I have a few questions going through my mind. Maybe someone is willing to share their perspective.
1. Is it good practice, or at least normal, to keep a copy of each indexed record in a relational database in order to be able to run queries? I am mostly thinking of how to reframe use cases like "list all products a user has in their shopping basket", which in my book requires passing a whole list to the search service. What is the way to go?
2. Is it possible to link collections, in other words to have nested documents like: {meta1: 123, meta2: 123, subitems: [{itemvalue1: 123, itemvalue2: 456}, {itemvalue1: 123, itemvalue2: 456}]}
3. I have already implemented ranking in a project, but is it possible to do "user-based" ranking, i.e. when searching for YouTube videos, your personally frequently watched videos are ranked way higher?
4. Is search something used for recommendations based on earlier usage?
m
Most modern RDBMSs can handle JSON documents these days; it really depends on how performant you want things to be.
a
But they can't search them like Typesense. That's why I thought of having two synced systems. Maybe there is a better way @Mac Cordingley
m
Postgres makes a fair attempt at this: https://www.postgresql.org/docs/9.3/functions-json.html
But again, it really depends on the use case.
You may be better off looking at a GraphQL layer, specifically one that can aggregate data from many data sources but present a concise schema and abstract that away from the user.
I know that the guys here are actively looking into this type of thing, but in the meantime take a look at graphql-mesh.
a
Tbh, I can't really see how this relates. I mean, sure, there are JSON-capable systems. For example, MariaDB has a fully fledged engine for that too: https://mariadb.com/kb/en/json-functions/ I don't actively use GraphQL, but I am guessing that the mesh you are referring to does the same as what I am already proposing to do: merge data from multiple sources. @Mac Cordingley do you have an idea regarding the last part of the question, "to pass a whole list to the search service"? I think this arises out of wanting to merge sources.
Thank you for your response @Mac Cordingley
m
The crawler is quite versatile. If you wanted to, you would just have to expose your data in a way that it can ingest, e.g. run a local web server or host a flat file that the crawler can use. Essentially, you can index pretty much anything.
Again, it comes back to performance and high-availability needs.
GraphQL is excellent for removing load on the client side, as you only get the data you ask for, unlike REST/SOAP etc. But it's fairly cumbersome to get your stitched schemas right. It takes some thought, let me put it that way.
And depending on your data sources, you can easily introduce latency and then have to reassess your indexes, etc.
a
I now kinda see where you are going with this. I am actually asking at a much more abstract level, even before thinking about how to send data to the user, as that part is already figured out. Everyone has their favorite tools for that. 😆
m
lol
yep
think about cloudflare workers also
a
It's not like I don't enjoy GraphQL from time to time, though.
m
I'm actually an MS SQL vet
a
How do Cloudflare Workers relate to this? Are we still talking about question 1?
m
I use Workers to cache data from GraphQL on the edge network
so you bring down latency by a mile
I don't know what your use case is, really
k
1. Yes, you should sync data from your primary database into Typesense. Typesense is not meant to be used as a primary transactional database.
2. We don't support nested fields or collection-level joins at the moment, but nested fields have been prioritised for the next version (ETA roughly 2 months).
3 & 4. Typesense does not support search personalization or recommendation at the moment, but these are certainly areas we are interested in pursuing in future, as there is a lot of overlap with search.
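Until nested fields land, one possible workaround for point 2 is to flatten each sub-item into its own document that repeats the parent's metadata. A minimal Python sketch, assuming the document shape from the question; the `parent_id` field and the `flatten_document` helper are hypothetical names for illustration, not part of Typesense:

```python
from typing import Any


def flatten_document(parent_id: str, doc: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn one nested document into flat documents, one per sub-item."""
    # Everything except the nested list is treated as parent metadata.
    parent_fields = {k: v for k, v in doc.items() if k != "subitems"}
    flat_docs = []
    for index, subitem in enumerate(doc.get("subitems", [])):
        flat_docs.append({
            "id": f"{parent_id}-{index}",  # hypothetical ID scheme
            "parent_id": parent_id,        # hypothetical back-reference field
            **parent_fields,
            **subitem,
        })
    return flat_docs


nested = {
    "meta1": 123,
    "meta2": 123,
    "subitems": [
        {"itemvalue1": 123, "itemvalue2": 456},
        {"itemvalue1": 123, "itemvalue2": 456},
    ],
}
flat = flatten_document("42", nested)
```

Each flat document could then be indexed on its own, with results grouped back by `parent_id` in application code. The trade-off is duplicated parent metadata that has to be kept in sync.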
a
@Kishore Nallan How does someone even go about implementing recommendation based search?
k
@Alexander Zierhut Typesense does not support this directly yet, but we have it on our roadmap. When we support vector indices, we could offer a way to pull up semantically similar data based on title etc. In the meantime, a simple heuristic that I've seen work really well is to suggest other items in the same category which are numerically similar (e.g. same price range).
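A minimal Python sketch of that heuristic, done in application code rather than in Typesense. The `Product` type, the `similar_items` helper and the 20% price tolerance are assumptions made up for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    id: str
    category: str
    price: float


def similar_items(target: Product, catalog: list[Product],
                  tolerance: float = 0.2, limit: int = 5) -> list[Product]:
    """Suggest items in the same category whose price is within +/- tolerance."""
    low = target.price * (1 - tolerance)
    high = target.price * (1 + tolerance)
    candidates = [
        p for p in catalog
        if p.id != target.id
        and p.category == target.category
        and low <= p.price <= high
    ]
    # Closest price first.
    candidates.sort(key=lambda p: abs(p.price - target.price))
    return candidates[:limit]


catalog = [
    Product("a", "shoes", 100.0),
    Product("b", "shoes", 110.0),
    Product("c", "shoes", 95.0),
    Product("d", "shoes", 200.0),
    Product("e", "hats", 100.0),
]
suggestions = similar_items(catalog[0], catalog)
```

With a search index in place, the same idea would presumably become a filter on category plus a numeric range on price instead of a Python loop.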
a
@Kishore Nallan I understand. Thank you for the idea, this is what I am doing too. My original question was aimed more at the "personal search" part. How does someone return different results based on user preferences, e.g. a 10k-item watch history? I am not asking you to implement it, I am simply curious about how someone would even go about that. My only guess is to compress the preferences into a few categories and how pronounced each one is.
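That guess could be sketched roughly like this in Python. It is only an illustration, assuming each watched video and each search result carries a category label; `build_profile`, `personalize` and the `boost` parameter are made-up names:

```python
from collections import Counter


def build_profile(watch_history: list[str], top_n: int = 3) -> dict[str, float]:
    """Compress a list of watched categories into the top-N category weights."""
    counts = Counter(watch_history)
    total = sum(counts.values())
    return {category: count / total
            for category, count in counts.most_common(top_n)}


def personalize(results: list[tuple[str, str, float]],
                profile: dict[str, float],
                boost: float = 1.0) -> list[tuple[str, str, float]]:
    """Re-rank (doc_id, category, text_score) results by the user's profile."""
    def personalized_score(result: tuple[str, str, float]) -> float:
        _, category, score = result
        # Categories outside the profile keep their original score.
        return score * (1 + boost * profile.get(category, 0.0))

    return sorted(results, key=personalized_score, reverse=True)


history = ["music"] * 6 + ["cooking"] * 3 + ["news"] * 1
profile = build_profile(history)
results = [("v1", "news", 1.0), ("v2", "music", 0.8), ("v3", "sports", 0.9)]
ranked = personalize(results, profile)
```

Here the search engine's own score is kept and only multiplied by a per-user factor, so the profile stays tiny no matter how long the watch history gets.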
k
Personalization requires machine learning. See this post for various ways it is done: https://eugeneyan.com/writing/patterns-for-personalization/