Is there any way to upgrade to a rc from docker an...
# community-help
r
Is there any way to upgrade to a rc from docker and keeping the indexing data?
k
Yes, simply point the new Docker image to the same data directory. Typesense will start, re-index the on-disk data, and be ready. There will be a small downtime during this restart process.
r
thanks!
can we install the latest rc without docker too?
on a linux server
k
Do you need a RPM or DEB?
r
DEB
k
r
thanks!
k
Welcome, let me know how it works.
r
how should we install it and keep the index?
k
1. Take a backup of the data dir just in case. 2. Stop the service. 3. Install the deb with:
apt -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" -y install ./new-typesense.deb
These options ensure that if a config file already exists on disk, it is reused.
r
thanks!
So we installed the latest RC. The issue that we have is this:
• we have a travel app with these collections indexed:
  ◦ attractions
  ◦ destinations
  ◦ countries
  ◦ users
When I search for “Paris”, I expect Paris (from the destinations collection) to be displayed first. However, the response from Typesense puts these collections in the same order every time (attractions -> destinations -> countries -> users).
this is how we get the results
k
Are you querying across multiple collections using multi_search?
r
@Ioan-Andrei Batinas can answer this
i
we don't, we query each collection and merge the results
k
When you merge the results, how do you sort them? Based on the text_match score value?
i
yes
desc by text_match, assuming the biggest score is the most accurate
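A minimal sketch of the merge step described here, assuming each per-collection search yields an array of hits that carry a numeric `text_match`. The `mergeHits` name, the input shape, and the sample scores are made up for illustration and are not taken from the app's code.

```javascript
// Merge per-collection hits into one list and sort by text_match, highest first.
// `hitsByType` is a hypothetical shape: { attraction: [hit, ...], destination: [...] }.
function mergeHits(hitsByType) {
  const merged = [];
  for (const [type, hits] of Object.entries(hitsByType)) {
    for (const hit of hits) {
      // Tag every hit with the collection it came from.
      merged.push({ ...hit, type });
    }
  }
  // Array.prototype.sort is stable, so hits with equal scores keep their input order.
  return merged.sort((a, b) => b.text_match - a.text_match);
}

// Illustrative scores only.
const merged = mergeHits({
  country: [{ document: { id: 'c1' }, text_match: 100 }],
  attraction: [
    { document: { id: 'a1' }, text_match: 300 },
    { document: { id: 'a2' }, text_match: 150 },
  ],
  destination: [{ document: { id: 'd1' }, text_match: 200 }],
});
console.log(merged.map((hit) => `${hit.type}:${hit.text_match}`));
```

Sorting this way only gives a sensible order if the scores from the different collections are comparable with each other.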
k
The issue here is that the query `paris` is not an exact match with `paris, france` -- Typesense does not rank strings that are shorter ahead of strings that are longer, i.e. we only look at the number of tokens matched, whether there are typos, and the number of fields in a record that match the query. Exact matching requires an exact match of the token, i.e. a `paris` query will match a field with the string `Paris`.
i
would multi search be a better option?
k
No, multi search will just return independent per-collection results -- it just parallelizes the query.
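For reference, a rough sketch of what a multi_search request for this app could look like. The `buildMultiSearchRequest` helper is hypothetical, and the collection-to-field mapping is an assumption based on the field names that appear later in this thread. The client call is shown only as a comment because it needs a configured typesense-js client.

```javascript
// Build the body of a multi_search request: one search entry per collection.
// The mapping passed in below is an assumption for illustration.
function buildMultiSearchRequest(q, queryByCollection) {
  return {
    searches: Object.entries(queryByCollection).map(([collection, queryBy]) => ({
      collection,
      q,
      query_by: queryBy,
    })),
  };
}

const request = buildMultiSearchRequest('paris', {
  attractions: 'attraction_name',
  destinations: 'destination_name',
  countries: 'name',
});
console.log(JSON.stringify(request, null, 2));

// With a configured typesense-js client (not created here), the call would be:
//   const { results } = await client.multiSearch.perform(request, {});
// `results` holds one entry per search, in request order, each with its own hits,
// so merging across collections is still up to the application.
```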
i
just a heads up, the name of the destination is just "Paris"
in the screenshot above, he just appended the country name as well
the indexed value is "Paris"
k
I see. In that case, can you give me a sample data set where this problem can be illustrated?
We have other customers using this exact match feature in RC so it might be some other issue at play here.
i
Is giving you an exact response JSON OK?
k
Yes, exact JSON response of a search against a single collection is fine.
i
[
    {
        "document": {
            "attraction_name": "Mosquee de Paris",
            "coordinates": "nan",
            "country_id": "82",
            "country_name": "France",
            "destination_name": "Paris",
            "id": "880",
            "parent_destination_name": "Lhasa"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Paris"
                ],
                "snippet": "<mark>Paris</mark>"
            },
            {
                "field": "attraction_name",
                "matched_tokens": [
                    "Paris"
                ],
                "snippet": "Mosquee de <mark>Paris</mark>"
            }
        ],
        "text_match": 2203368317191,
        "type": "attraction"
    },
    {
        "document": {
            "attraction_name": "The Paris Catacombs",
            "coordinates": "nan",
            "country_id": "82",
            "country_name": "France",
            "destination_name": "Paris",
            "id": "906",
            "parent_destination_name": "Lhasa"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Paris"
                ],
                "snippet": "<mark>Paris</mark>"
            },
            {
                "field": "attraction_name",
                "matched_tokens": [
                    "Paris"
                ],
                "snippet": "The <mark>Paris</mark> Catacombs"
            }
        ],
        "text_match": 2203368317191,
        "type": "attraction"
    },
    {
        "document": {
            "attraction_name": "Eglise Saint-Etienne-du-Mont de Paris",
            "coordinates": "nan",
            "country_id": "82",
            "country_name": "France",
            "destination_name": "Paris",
            "id": "831",
            "parent_destination_name": "Lhasa"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Paris"
                ],
                "snippet": "<mark>Paris</mark>"
            },
            {
                "field": "attraction_name",
                "matched_tokens": [
                    "Paris"
                ],
                "snippet": "Eglise Saint-Etienne-du-Mont de <mark>Paris</mark>"
            }
        ],
        "text_match": 2203368317191,
        "type": "attraction"
    },
    {
        "document": {
            "coordinates": "2.3522,48.8566",
            "country_id": "82",
            "country_name": "France",
            "destination_name": "Paris",
            "id": "42",
            "parent_destination_name": "Lhasa"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Paris"
                ],
                "snippet": "<mark>Paris</mark>"
            }
        ],
        "text_match": 1103840043779,
        "type": "destination"
    },
    {
        "document": {
            "coordinates": "25.160855,37.080582",
            "country_id": "92",
            "country_name": "Greece",
            "destination_name": "Paros",
            "id": "986"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Paros"
                ],
                "snippet": "<mark>Paros</mark>"
            }
        ],
        "text_match": 4328350465,
        "type": "destination"
    },
    {
        "document": {
            "coordinates": "10.3280833,44.8013678",
            "country_id": "114",
            "country_name": "Italy",
            "destination_name": "Parma",
            "id": "676"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Parma"
                ],
                "snippet": "<mark>Parma</mark>"
            }
        ],
        "text_match": 4328284929,
        "type": "destination"
    },
    {
        "document": {
            "id": "211",
            "name": "Sri Lanka"
        },
        "highlights": [
            {
                "field": "name",
                "matched_tokens": [
                    "Sri"
                ],
                "snippet": "<mark>Sri</mark> Lanka"
            }
        ],
        "text_match": 33317888,
        "type": "country"
    },
    {
        "document": {
            "id": "196",
            "name": "San Marino"
        },
        "highlights": [
            {
                "field": "name",
                "matched_tokens": [
                    "Marino"
                ],
                "snippet": "San <mark>Marino</mark>"
            }
        ],
        "text_match": 33317888,
        "type": "country"
    },
    {
        "document": {
            "full_name": "LarisaNegreanu",
            "id": "98",
            "username": "larisa.negreanu"
        },
        "highlights": [
            {
                "field": "username",
                "matched_tokens": [
                    "larisa.negreanu"
                ],
                "snippet": "<mark>larisa.negreanu</mark>"
            },
            {
                "field": "full_name",
                "matched_tokens": [
                    "LarisaNegreanu"
                ],
                "snippet": "<mark>LarisaNegreanu</mark>"
            }
        ],
        "text_match": 4344668419,
        "type": "user"
    },
    {
        "document": {
            "full_name": "LarisaNegreanu",
            "id": "110",
            "username": "larisanegreanu"
        },
        "highlights": [
            {
                "field": "username",
                "matched_tokens": [
                    "larisanegreanu"
                ],
                "snippet": "<mark>larisanegreanu</mark>"
            },
            {
                "field": "full_name",
                "matched_tokens": [
                    "LarisaNegreanu"
                ],
                "snippet": "<mark>LarisaNegreanu</mark>"
            }
        ],
        "text_match": 4344668419,
        "type": "user"
    },
    {
        "document": {
            "full_name": "MariusIonescu",
            "id": "285",
            "username": "marius.ionescu"
        },
        "highlights": [
            {
                "field": "username",
                "matched_tokens": [
                    "marius.ionescu"
                ],
                "snippet": "<mark>marius.ionescu</mark>"
            },
            {
                "field": "full_name",
                "matched_tokens": [
                    "MariusIonescu"
                ],
                "snippet": "<mark>MariusIonescu</mark>"
            }
        ],
        "text_match": 4344471811,
        "type": "user"
    }
]
k
If I can't find anything obvious, I might still need a representative dataset that reproduces the issue so I can debug further locally.
i
here is our response after querying multiple collections, getting the top responses, then merging the results and sorting by text_match
k
Can you also please tell me your exact query?
i
one sec
from each collection?
all of them are
k
Which record in that JSON would you like to appear first?
i
const searchParameters = {
    q: key,
    query_by: 'attraction_name, destination_name, parent_destination_name',
  }
just in different collections
k
I think the issue is with multi-field matching.
i
for countries it is just
const searchParameters = {
    q: key,
    query_by: 'name',
  }
in the countries collection
typeClient
    .collections('countries')
    .documents()
    .search(searchParameters)
k
Yes, the issue is that in the countries collection a match can come from only a single field, but when you query the other collections there can be many fields that match.
For example, the `Mosquee de Paris` record contains 2 fields which have the word `paris`, which is why its match score is higher than that of the `Paris, France` record.
You can try using the `query_by_weights` parameter to set a much higher weight when querying the countries collection.
i
this would be just query_by_weights: 1, in this case where we are querying for only one field
k
It would be `query_by_weights: 10` when querying a single field but something like `query_by_weights: 4,3,2` when querying multiple fields. The values will depend on your exact use case. The basic gist is using weights to control the relative importance of fields.
One can say that a match on a city or country is far more important than a match on an attraction name, so hence the higher weight.
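A small sketch of how such weighted parameters could be built, assuming `query_by_weights` is passed as a comma-separated string with one weight per `query_by` field, in the same order. The `buildSearchParams` helper and the field order in the second example are assumptions for illustration, not values recommended in this thread.

```javascript
// Build search parameters with one weight per queried field.
// `buildSearchParams` is a hypothetical helper, not part of typesense-js.
function buildSearchParams(q, fields, weights) {
  if (weights && weights.length !== fields.length) {
    throw new Error('query_by_weights needs exactly one weight per query_by field');
  }
  const params = { q, query_by: fields.join(',') };
  if (weights) {
    params.query_by_weights = weights.join(',');
  }
  return params;
}

// Single field: one weight.
const countryParams = buildSearchParams('paris', ['name'], [10]);

// Several fields: weights line up with fields by position (order assumed here).
const attractionParams = buildSearchParams(
  'paris',
  ['destination_name', 'attraction_name', 'parent_destination_name'],
  [4, 3, 2],
);

console.log(countryParams);
console.log(attractionParams);
```

Either object can then be passed to `.search(...)` on the matching collection, in the same way as the `searchParameters` objects shown earlier.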
i
We modified it and got a better result. Now users appear higher in the ranking, probably because they are indexed by full name and username and so get 2 matches.
if we make the 2 fields into one field and reindex, would the score be lower?
it will still hit 2 words, but would be one field not 2, would this be the case?
k
A single-field match will get a lower score than a two-field match.
i
We modified our queries to target a single field, since we search through all collections and then sort by text_match. Results are somewhat better, but there are some results we don't understand. Searching for romania, we get:
{
        "document": {
            "coordinates": "12.4964,41.9028",
            "country_id": "114",
            "country_name": "Italy",
            "destination_name": "Rome",
            "id": "44"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Rome"
                ],
                "snippet": "<mark>Rome</mark>"
            }
        ],
        "text_match": 4328219393,
        "type": "destination"
    },
    {
        "document": {
            "id": "187",
            "name": "Romania"
        },
        "highlights": [
            {
                "field": "name",
                "matched_tokens": [
                    "Romania"
                ],
                "snippet": "<mark>Romania</mark>"
            }
        ],
        "text_match": 33514498,
        "type": "country"
    },
    {
        "document": {
            "attraction_name": "Museum of the National Bank of Romania",
            "coordinates": "nan",
            "country_id": "187",
            "country_name": "Romania",
            "destination_name": "Bucharest",
            "id": "4690"
        },
        "highlights": [
            {
                "field": "attraction_name",
                "matched_tokens": [
                    "Romania"
                ],
                "snippet": "Museum of the National Bank of <mark>Romania</mark>"
            }
        ],
        "text_match": 33514496,
        "type": "attraction"
    },
Another example: searching for a destination called 'Peles Castle', we get it as the 4th result, with the first being Tel Aviv:
{
        "document": {
            "coordinates": "34.7818,32.0853",
            "country_id": "113",
            "country_name": "Israel",
            "destination_name": "Tel Aviv",
            "id": "117"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Tel"
                ],
                "snippet": "<mark>Tel</mark> Aviv"
            }
        ],
        "text_match": 4328219393,
        "type": "destination"
    },
    {
        "document": {
            "coordinates": "-15.435657,28.12295",
            "country_id": "210",
            "country_name": "Spain",
            "destination_name": "Las Palmas de Gran Canaria",
            "id": "587"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Las"
                ],
                "snippet": "<mark>Las</mark> Palmas de Gran Canaria"
            }
        ],
        "text_match": 4328219393,
        "type": "destination"
    },
    {
        "document": {
            "coordinates": "-115.1398,36.1699",
            "country_id": "236",
            "country_name": "United States",
            "destination_name": "Las Vegas",
            "id": "47"
        },
        "highlights": [
            {
                "field": "destination_name",
                "matched_tokens": [
                    "Las"
                ],
                "snippet": "<mark>Las</mark> Vegas"
            }
        ],
        "text_match": 4328219393,
        "type": "destination"
    },
    {
        "document": {
            "attraction_name": "Peleș Castle",
            "coordinates": "nan",
            "country_id": "187",
            "country_name": "Romania",
            "destination_name": "Sinaia",
            "id": "18878"
        },
        "highlights": [
            {
                "field": "attraction_name",
                "matched_tokens": [
                    "Peleș",
                    "Castle"
                ],
                "snippet": "<mark>Peleș</mark> <mark>Castle</mark>"
            }
        ],
        "text_match": 50291458,
        "type": "attraction"
    },
could you offer some insight?
k
What's the weight you are using for the field `destination_name`?
It seems like the weight of the `destination_name` field just overpowers everything else.
i
in the destinations collection, there is no weight on it. Only weight we have is on the country collection with a weight of 10
searching in the destination collection i mean, has no weight to it
k
Can you please paste your updated query parameters again for all the 3 collections?
i
after considering what you said this morning, I just changed the destination query: instead of querying destination_name and parent_destination_name, it uses just one field
using just one field, produces more favorable results
one field for all i mean
k
You mean combine the multiple fields to a single field now?
It would be good to have the queries again so I can relook at it.
i
yeah, i modified the search in each collection to only use one field
which is the name
and the results improved greatly
when I gave the last JSON, I was searching in destinations using 2 fields, which spiked the text_match a lot
k
Can you place the updated query params for each collection here? I can then see if I can explain the oddities.
i
ok
const searchParameters = {
    q: key,
    query_by: 'attraction_name',
  }
const searchParameters = {
    q: key,
    query_by: 'destination_name',
  }
const searchParameters = {
    q: key,
    query_by: 'collection_name',
  }
const searchParameters = {
    q: key,
    query_by: 'name',
    query_by_weights: 10,
  }
last one is in countries
if you change to
const searchParameters = {
    q: key,
    query_by: 'destination_name, parent_destination_name',
  }
the results are really bad: the text_match is huge compared to the other ones
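One possible guard against this, sketched under assumptions: before merging by `text_match`, check that every collection is queried on the same number of fields. This is a hypothetical application-side check, not a Typesense feature, and it does not account for `query_by_weights`, which also shifts scores. The function names are made up for illustration.

```javascript
// Count the fields listed in a comma-separated query_by string.
function fieldCount(params) {
  return params.query_by
    .split(',')
    .map((field) => field.trim())
    .filter(Boolean).length;
}

// Return the collections whose query uses more fields than the smallest query.
function collectionsWithExtraFields(paramsByCollection) {
  const counts = Object.entries(paramsByCollection).map(([name, params]) => [
    name,
    fieldCount(params),
  ]);
  const min = Math.min(...counts.map(([, count]) => count));
  return counts.filter(([, count]) => count > min).map(([name]) => name);
}

const flagged = collectionsWithExtraFields({
  attractions: { q: 'romania', query_by: 'attraction_name' },
  destinations: { q: 'romania', query_by: 'destination_name, parent_destination_name' },
  countries: { q: 'romania', query_by: 'name', query_by_weights: 10 },
});
console.log(flagged); // collections queried on more fields than the others
```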
k
Got it. And the Romania example above is with a single search param correct? It's late here so I might get back to you tomorrow to resume this convo.
@Ioan-Andrei Batinas Can you post the JSON response value of the first result record when you query for `romania` on just the destination collection against the `destination_name` field?
I was perplexed by why the `"destination_name": "Rome"` record had such a high match score value of `4328219393` in your JSON snippet above. So I indexed just that record in a new collection and queried for it. It is returning a different and lower match score. See here: https://gist.github.com/kishorenc/05789f80175135f7d7d5e9f0b944819f
i
We found out why the text match was high. It is because we were searching on 2 fields. Querying by one field returned normal results.
k
Cool 👍