Hi I have a few questions going through my mind Maybe someon typesense #community-help

Alexander Zierhut

05/08/2022, 3:51 PM

Hi, I have a few questions going through my mind. Maybe someone is willing to tell me their perspective.
1. Is it good practice, or at least normal, to keep a copy of each indexed record in a relational database so as to be able to do queries? I am mostly thinking of how to reframe use cases like "list all products a user has in their shopping basket", which in my book requires passing a whole list to the search service. What is the way to go?
2. Is it possible to link collections, in other words have nested documents like: {meta1: 123, meta2: 123, subitems: [{itemvalue1: 123, itemvalue2: 456}, {itemvalue1: 123, itemvalue2: 456}]}
3. I have already implemented ranking in a project, but is it possible to do "user based" ranking, i.e. when searching for YouTube videos, your personally frequently watched videos are ranked way higher?
4. Is search something used for recommendations based on earlier usage?

Mac Cordingley

05/08/2022, 7:02 PM

Most modern RDBMS systems can handle json documents these days, it really depends on how performant you want things to be.

Alexander Zierhut

05/08/2022, 7:03 PM

But they can't search them like typesense. That's why I thought of having two synced systems. Maybe there is a better way @Mac Cordingley

Mac Cordingley

05/08/2022, 7:04 PM

postgres does a fair attempt at this https://www.postgresql.org/docs/9.3/functions-json.html

Mac Cordingley

05/08/2022, 7:05 PM

but again really depends on the use case

Mac Cordingley

05/08/2022, 7:07 PM

you may be better off looking at a graphql layer, specifically one that can aggregate data from many data sources but present a concise schema, and just abstract that away from the user

Mac Cordingley

05/08/2022, 7:08 PM

I know that the guys here are actively looking into this type of thing, but in the meantime take a look at graphql-mesh

Mac Cordingley

05/08/2022, 7:09 PM

https://www.graphql-mesh.com/docs/getting-started/overview

Alexander Zierhut

05/08/2022, 7:14 PM

Tbh. I can't really see how this relates. I mean sure there are JSON capable systems. For example MariaDB has a fully fledged engine for that too: https://mariadb.com/kb/en/json-functions/ I don't actively use graphql, but I am guessing that this mesh you are referring to is the same as what I am proposing to do already. Merge data from multiple sources. @Mac Cordingley do you have an idea regarding the last part of the question "to pass a whole list to the search service"? I think this arises out of wanting to merge sources.

Alexander Zierhut

05/08/2022, 7:16 PM

Thank you for your response @Mac Cordingley

Mac Cordingley

05/08/2022, 7:17 PM

The crawler is quite versatile. If you wanted to, you would just have to expose your data in a way that it can ingest, e.g. a local web server or a hosted flat file that the crawler can use. Essentially you can index pretty much anything

Mac Cordingley

05/08/2022, 7:18 PM

Again it comes back to performance and high availability needs

Mac Cordingley

05/08/2022, 7:20 PM

graphql is excellent for removing the load on the client side as you only get the data you ask for, unlike REST/SOAP etc. But it's fairly cumbersome to get your stitched schemas right. It takes some thought, let me put it that way

Mac Cordingley

05/08/2022, 7:21 PM

and depending on your data sources, you can easily introduce latency and then have to reassess your indexes etc

Alexander Zierhut

05/08/2022, 7:22 PM

I now kinda see where you are going with this. I am actually asking on a way more abstract level, even before thinking of how to send data to the user, as that part is already figured out. Everyone has their favorite tools for that. 😆

Mac Cordingley

05/08/2022, 7:22 PM

lol

Mac Cordingley

05/08/2022, 7:22 PM

yep

Mac Cordingley

05/08/2022, 7:23 PM

think about cloudflare workers also

Alexander Zierhut

05/08/2022, 7:23 PM

It's not like I don't enjoy GraphQL from time to time though.

Mac Cordingley

05/08/2022, 7:23 PM

im actually an MS SQL vet

Alexander Zierhut

05/08/2022, 7:24 PM

How would Cloudflare Workers relate to this? Are we still talking about question 1?

Mac Cordingley

05/08/2022, 7:25 PM

i use workers to cache data from graphql on the edge network

Mac Cordingley

05/08/2022, 7:25 PM

so you bring down latency by a mile

Mac Cordingley

05/08/2022, 7:25 PM

i don't know what your use case is really

Kishore Nallan

05/09/2022, 7:37 AM

1. Yes, you should sync data from your primary database into Typesense. Typesense is not meant to be used as a primary transactional database.
2. We don't support nested fields or collection-level joins at the moment, but nested fields have been prioritised for the next version (ETA roughly 2 months).
3 & 4. Typesense does not support search personalization or recommendation at the moment, but these are certainly areas we are interested in pursuing in future, as there is a lot of overlap with search.
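
(Editor's sketch, not part of the thread.) Until nested fields are supported, one common workaround is to flatten the nested document from question 2 into top-level array fields before indexing it. The snippet below is a minimal sketch of that idea in plain Python. The helper name `flatten_subitems` and the `subitems_<field>` naming scheme are assumptions made for illustration, not anything Typesense prescribes.

```python
from typing import Any


def flatten_subitems(doc: dict[str, Any], key: str = "subitems") -> dict[str, Any]:
    """Replace a list of sub-objects with one top-level array per sub-field.

    Hypothetical helper: the "<key>_<field>" naming is an illustrative choice.
    """
    # Copy every top-level field except the nested list itself.
    flat = {k: v for k, v in doc.items() if k != key}
    # Collect each sub-field into a parallel array, preserving item order.
    for item in doc.get(key, []):
        for field, value in item.items():
            flat.setdefault(f"{key}_{field}", []).append(value)
    return flat


# The example document from question 2.
doc = {
    "meta1": 123,
    "meta2": 123,
    "subitems": [
        {"itemvalue1": 123, "itemvalue2": 456},
        {"itemvalue1": 123, "itemvalue2": 456},
    ],
}
flat_doc = flatten_subitems(doc)
```

Note that the parallel arrays only stay aligned by position if every sub-item carries the same fields. The full nested record would still live in the primary database, in line with answer 1.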

Alexander Zierhut

05/17/2022, 8:53 AM

@Kishore Nallan How does someone even go about implementing recommendation based search?

Kishore Nallan

05/19/2022, 9:05 AM

@Alexander Zierhut Typesense does not support this directly yet, but we have it on our roadmap. When we support vector indices, we could offer a way to pull up semantically similar data based on title etc. In the meantime, a simple heuristic that I've seen work really well is to suggest other items in the same category which are numerically similar (e.g. same price range).
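
(Editor's sketch, not part of the thread.) The "same category, similar price" heuristic can be illustrated in a few lines of plain Python over an in-memory catalog. The `Item` type, the `similar_items` function, and the 20% price tolerance are all assumptions for the sake of the example. In practice the same condition could presumably be expressed as a filter on the search engine or the primary database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    id: str
    category: str
    price: float


def similar_items(
    target: Item,
    catalog: list[Item],
    tolerance: float = 0.2,  # assumed: "similar" means within +/- 20% of the price
    limit: int = 5,
) -> list[Item]:
    """Suggest items in the same category whose price is close to the target's."""
    low = target.price * (1 - tolerance)
    high = target.price * (1 + tolerance)
    candidates = [
        item
        for item in catalog
        if item.id != target.id
        and item.category == target.category
        and low <= item.price <= high
    ]
    # Closest price first.
    candidates.sort(key=lambda item: abs(item.price - target.price))
    return candidates[:limit]


catalog = [
    Item("a", "shoes", 100.0),
    Item("b", "shoes", 110.0),
    Item("c", "shoes", 95.0),
    Item("d", "shoes", 200.0),
    Item("e", "hats", 100.0),
]
suggestions = similar_items(catalog[0], catalog)
```

Here item "d" is excluded for being outside the price band and "e" for being in a different category.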

Alexander Zierhut

05/19/2022, 8:38 PM

@Kishore Nallan I understand. Thank you for the idea, this is what I am doing too. My original question targeted more the "personal search" part. How does someone return different results based on user preferences i.e. a 10k item long watch history. I am not asking you to implement it, I am simply curious about how someone would even go about that? My only guess is to compress preferences to a few categories and their markedness
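
(Editor's sketch, not part of the thread.) The guess above, compressing a long history into a few categories and their weights, could look roughly like the following. This is only a toy illustration of that one idea, not how Typesense or any production recommender works. The function names, the `score` and `category` keys on each hit, and the multiplicative boost formula are all assumptions.

```python
from collections import Counter


def build_profile(watch_history: list[str], top_n: int = 3) -> dict[str, float]:
    """Compress a history of watched categories into top_n normalized weights."""
    top = Counter(watch_history).most_common(top_n)
    total = sum(count for _, count in top)
    # An empty history yields an empty profile (no division happens).
    return {category: count / total for category, count in top}


def rerank(results: list[dict], profile: dict[str, float], boost: float = 1.0) -> list[dict]:
    """Re-order search hits by boosting the base score of preferred categories."""

    def personalized_score(hit: dict) -> float:
        weight = profile.get(hit["category"], 0.0)
        return hit["score"] * (1 + boost * weight)

    return sorted(results, key=personalized_score, reverse=True)


# A long history reduces to a handful of numbers per user.
history = ["cooking"] * 6 + ["music"] * 3 + ["news"] * 1
profile = build_profile(history, top_n=2)

# Hits as they might come back from a search, with a base relevance score.
results = [
    {"id": "v1", "category": "news", "score": 1.0},
    {"id": "v2", "category": "cooking", "score": 0.8},
    {"id": "v3", "category": "music", "score": 0.9},
]
personalized = rerank(results, profile)
```

The profile is small enough to store per user and apply after the search returns, so the 10k-item history never has to be sent to the search service.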

Kishore Nallan

05/20/2022, 1:29 AM

Personalization requires machine learning. See this post for various ways it is done: https://eugeneyan.com/writing/patterns-for-personalization/
