Docker Upgrade and Indexing Data Issues for Travel App
TLDR The thread discussed upgrading Docker while retaining indexed data and addressed search result ranking issues in a travel app with collections for attractions, destinations, countries, and users. Kishore Nallan provided guidance on adjusting query parameters and weights to improve search outcomes.
Jun 17, 2021 (29 months ago)
Robert
09:50 AM
Kishore Nallan
09:54 AM
Robert
09:55 AM
Robert
09:56 AM
Robert
09:56 AM
Kishore Nallan
10:05 AM
Robert
10:24 AM
Kishore Nallan
10:24 AM
Robert
10:25 AM
Kishore Nallan
10:25 AM
Robert
10:26 AM
Kishore Nallan
10:30 AM
2. Stop service.
3. Install deb with:
apt -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" -y install new-typesense.deb
Kishore Nallan
10:30 AM
Robert
10:31 AM
Robert
11:54 AM
• We have a travel app with these collections indexed:
◦ attractions
◦ destinations
◦ countries
◦ users
Robert
11:55 AM
Robert
11:56 AM
Robert
11:56 AM
Kishore Nallan
11:56 AM
Robert
11:57 AM
Ioan-Andrei
11:57 AM
Kishore Nallan
11:58 AM
Ioan-Andrei
11:58 AM
Ioan-Andrei
11:59 AM
Kishore Nallan
12:00 PM
`paris` is not an exact match with `paris, france` -- Typesense does not rank strings that are shorter ahead of strings that are longer, i.e. we only look at the number of tokens matched, whether there are typos, and the number of fields matching in a record against the query. Exact matching requires an exact match of the token, i.e. a `paris` query will match a field with the string `Paris`.
Ioan-Andrei
12:00 PM
Kishore Nallan
12:01 PM
Ioan-Andrei
12:02 PM
Ioan-Andrei
12:02 PM
Ioan-Andrei
12:02 PM
Kishore Nallan
12:03 PM
Kishore Nallan
12:04 PM
Ioan-Andrei
12:04 PM
Kishore Nallan
12:04 PM
Ioan-Andrei
12:05 PM
[
{
"document": {
"attraction_name": "Mosquee de Paris",
"coordinates": "nan",
"country_id": "82",
"country_name": "France",
"destination_name": "Paris",
"id": "880",
"parent_destination_name": "Lhasa"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Paris"
],
"snippet": "<mark>Paris</mark>"
},
{
"field": "attraction_name",
"matched_tokens": [
"Paris"
],
"snippet": "Mosquee de <mark>Paris</mark>"
}
],
"text_match": 2203368317191,
"type": "attraction"
},
{
"document": {
"attraction_name": "The Paris Catacombs",
"coordinates": "nan",
"country_id": "82",
"country_name": "France",
"destination_name": "Paris",
"id": "906",
"parent_destination_name": "Lhasa"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Paris"
],
"snippet": "<mark>Paris</mark>"
},
{
"field": "attraction_name",
"matched_tokens": [
"Paris"
],
"snippet": "The <mark>Paris</mark> Catacombs"
}
],
"text_match": 2203368317191,
"type": "attraction"
},
{
"document": {
"attraction_name": "Eglise Saint-Etienne-du-Mont de Paris",
"coordinates": "nan",
"country_id": "82",
"country_name": "France",
"destination_name": "Paris",
"id": "831",
"parent_destination_name": "Lhasa"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Paris"
],
"snippet": "<mark>Paris</mark>"
},
{
"field": "attraction_name",
"matched_tokens": [
"Paris"
],
"snippet": "Eglise Saint-Etienne-du-Mont de <mark>Paris</mark>"
}
],
"text_match": 2203368317191,
"type": "attraction"
},
{
"document": {
"coordinates": "2.3522,48.8566",
"country_id": "82",
"country_name": "France",
"destination_name": "Paris",
"id": "42",
"parent_destination_name": "Lhasa"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Paris"
],
"snippet": "<mark>Paris</mark>"
}
],
"text_match": 1103840043779,
"type": "destination"
},
{
"document": {
"coordinates": "25.160855,37.080582",
"country_id": "92",
"country_name": "Greece",
"destination_name": "Paros",
"id": "986"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Paros"
],
"snippet": "<mark>Paros</mark>"
}
],
"text_match": 4328350465,
"type": "destination"
},
{
"document": {
"coordinates": "10.3280833,44.8013678",
"country_id": "114",
"country_name": "Italy",
"destination_name": "Parma",
"id": "676"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Parma"
],
"snippet": "<mark>Parma</mark>"
}
],
"text_match": 4328284929,
"type": "destination"
},
{
"document": {
"id": "211",
"name": "Sri Lanka"
},
"highlights": [
{
"field": "name",
"matched_tokens": [
"Sri"
],
"snippet": "<mark>Sri</mark> Lanka"
}
],
"text_match": 33317888,
"type": "country"
},
{
"document": {
"id": "196",
"name": "San Marino"
},
"highlights": [
{
"field": "name",
"matched_tokens": [
"Marino"
],
"snippet": "San <mark>Marino</mark>"
}
],
"text_match": 33317888,
"type": "country"
},
{
"document": {
"full_name": "LarisaNegreanu",
"id": "98",
"username": "larisa.negreanu"
},
"highlights": [
{
"field": "username",
"matched_tokens": [
"larisa.negreanu"
],
"snippet": "<mark>larisa.negreanu</mark>"
},
{
"field": "full_name",
"matched_tokens": [
"LarisaNegreanu"
],
"snippet": "<mark>LarisaNegreanu</mark>"
}
],
"text_match": 4344668419,
"type": "user"
},
{
"document": {
"full_name": "LarisaNegreanu",
"id": "110",
"username": "larisanegreanu"
},
"highlights": [
{
"field": "username",
"matched_tokens": [
"larisanegreanu"
],
"snippet": "<mark>larisanegreanu</mark>"
},
{
"field": "full_name",
"matched_tokens": [
"LarisaNegreanu"
],
"snippet": "<mark>LarisaNegreanu</mark>"
}
],
"text_match": 4344668419,
"type": "user"
},
{
"document": {
"full_name": "MariusIonescu",
"id": "285",
"username": "marius.ionescu"
},
"highlights": [
{
"field": "username",
"matched_tokens": [
"marius.ionescu"
],
"snippet": "<mark>marius.ionescu</mark>"
},
{
"field": "full_name",
"matched_tokens": [
"MariusIonescu"
],
"snippet": "<mark>MariusIonescu</mark>"
}
],
"text_match": 4344471811,
"type": "user"
}
]
Kishore Nallan
12:05 PM
Ioan-Andrei
12:06 PM
Kishore Nallan
12:06 PM
Ioan-Andrei
12:06 PM
Ioan-Andrei
12:06 PM
Ioan-Andrei
12:07 PM
Kishore Nallan
12:07 PM
Ioan-Andrei
12:07 PM
const searchParameters = {
  q: key,
  query_by: 'attraction_name, destination_name, parent_destination_name',
}
Ioan-Andrei
12:07 PM
Kishore Nallan
12:08 PM
Ioan-Andrei
12:08 PM
const searchParameters = {
  q: key,
  query_by: 'name',
}
Ioan-Andrei
12:08 PM
Ioan-Andrei
12:08 PM
typeClient
  .collections('countries')
  .documents()
  .search(searchParameters)
Kishore Nallan
12:08 PM
Kishore Nallan
12:09 PM
`Mosquee de Paris` record contains 2 fields which have the word paris
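As a rough illustration of the point about matched fields (this is not Typesense's scoring code), the number of fields that matched can be read off each hit's `highlights` array in the results pasted above. The helper name `countMatchedFields` is hypothetical, and the two hits are trimmed copies of records from the thread.

```javascript
// Hypothetical helper: counts the fields that matched the query for one hit,
// using the `highlights` array that comes back with each search result.
function countMatchedFields(hit) {
  return hit.highlights.length;
}

// Trimmed copies of two hits from the results pasted in the thread.
const mosquee = {
  document: { attraction_name: 'Mosquee de Paris', destination_name: 'Paris' },
  highlights: [{ field: 'destination_name' }, { field: 'attraction_name' }],
  text_match: 2203368317191,
};
const parisDestination = {
  document: { destination_name: 'Paris' },
  highlights: [{ field: 'destination_name' }],
  text_match: 1103840043779,
};

console.log(countMatchedFields(mosquee));          // 2
console.log(countMatchedFields(parisDestination)); // 1
```

In this data the attraction matches in two fields and the destination in only one, which is consistent with the attraction carrying the higher `text_match` value.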
Kishore Nallan
12:10 PM
`Paris, France` record.
Kishore Nallan
12:11 PM
`query_by_weights` parameter to set a much higher weight when querying the countries collection.
Ioan-Andrei
12:14 PM
Kishore Nallan
12:16 PM
`query_by_weights: 10` when querying a single field but something like `query_by_weights: 4,3,2` when querying multiple fields. The values will depend on your exact use case. The basic gist is using weights to control relative popularity.
Kishore Nallan
12:16 PM
Ioan-Andrei
01:08 PM
Ioan-Andrei
01:09 PM
Ioan-Andrei
01:09 PM
Kishore Nallan
01:35 PM
Ioan-Andrei
03:34 PM
Searching for romania we get:
{
"document": {
"coordinates": "12.4964,41.9028",
"country_id": "114",
"country_name": "Italy",
"destination_name": "Rome",
"id": "44"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Rome"
],
"snippet": "<mark>Rome</mark>"
}
],
"text_match": 4328219393,
"type": "destination"
},
{
"document": {
"id": "187",
"name": "Romania"
},
"highlights": [
{
"field": "name",
"matched_tokens": [
"Romania"
],
"snippet": "<mark>Romania</mark>"
}
],
"text_match": 33514498,
"type": "country"
},
{
"document": {
"attraction_name": "Museum of the National Bank of Romania",
"coordinates": "nan",
"country_id": "187",
"country_name": "Romania",
"destination_name": "Bucharest",
"id": "4690"
},
"highlights": [
{
"field": "attraction_name",
"matched_tokens": [
"Romania"
],
"snippet": "Museum of the National Bank of <mark>Romania</mark>"
}
],
"text_match": 33514496,
"type": "attraction"
},
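To make the ordering concrete, here is a small sketch that assumes the app merges hits from all collections and sorts them by raw `text_match`. The thread does not show the merge code, so `sortByTextMatch` and the `label` field are hypothetical. The three hits are trimmed from the `romania` results above.

```javascript
// Hypothetical helper: orders merged hits from several collections
// by their raw text_match score, highest first.
function sortByTextMatch(hits) {
  return [...hits].sort((a, b) => b.text_match - a.text_match);
}

// Trimmed copies of the three hits above; `label` is an illustrative field
// standing in for name / destination_name / attraction_name.
const hits = [
  { type: 'country', label: 'Romania', text_match: 33514498 },
  { type: 'attraction', label: 'Museum of the National Bank of Romania', text_match: 33514496 },
  { type: 'destination', label: 'Rome', text_match: 4328219393 },
];

const ranked = sortByTextMatch(hits).map((h) => h.label);
console.log(ranked);
```

Under that assumption the `Rome` destination lands ahead of the `Romania` country, because its score is roughly two orders of magnitude larger than the other two.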
Ioan-Andrei
03:35 PM
Ioan-Andrei
03:35 PM
{
"document": {
"coordinates": "34.7818,32.0853",
"country_id": "113",
"country_name": "Israel",
"destination_name": "Tel Aviv",
"id": "117"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Tel"
],
"snippet": "<mark>Tel</mark> Aviv"
}
],
"text_match": 4328219393,
"type": "destination"
},
{
"document": {
"coordinates": "-15.435657,28.12295",
"country_id": "210",
"country_name": "Spain",
"destination_name": "Las Palmas de Gran Canaria",
"id": "587"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Las"
],
"snippet": "<mark>Las</mark> Palmas de Gran Canaria"
}
],
"text_match": 4328219393,
"type": "destination"
},
{
"document": {
"coordinates": "-115.1398,36.1699",
"country_id": "236",
"country_name": "United States",
"destination_name": "Las Vegas",
"id": "47"
},
"highlights": [
{
"field": "destination_name",
"matched_tokens": [
"Las"
],
"snippet": "<mark>Las</mark> Vegas"
}
],
"text_match": 4328219393,
"type": "destination"
},
{
"document": {
"attraction_name": "Peleș Castle",
"coordinates": "nan",
"country_id": "187",
"country_name": "Romania",
"destination_name": "Sinaia",
"id": "18878"
},
"highlights": [
{
"field": "attraction_name",
"matched_tokens": [
"Peleș",
"Castle"
],
"snippet": "<mark>Peleș</mark> <mark>Castle</mark>"
}
],
"text_match": 50291458,
"type": "attraction"
},
Ioan-Andrei
03:36 PM
Kishore Nallan
03:41 PM
`destination_name`?
Kishore Nallan
03:42 PM
`destination_name` field just overpowers everything else.
Ioan-Andrei
03:42 PM
Ioan-Andrei
03:43 PM
Kishore Nallan
03:51 PM
Ioan-Andrei
03:52 PM
Ioan-Andrei
03:52 PM
Ioan-Andrei
03:52 PM
Kishore Nallan
03:52 PM
Kishore Nallan
03:54 PM
Ioan-Andrei
04:26 PM
Ioan-Andrei
04:26 PM
Ioan-Andrei
04:26 PM
Ioan-Andrei
04:27 PM
Kishore Nallan
04:27 PM
Ioan-Andrei
04:27 PM
Ioan-Andrei
04:27 PM
const searchParameters = {
  q: key,
  query_by: 'attraction_name',
}
Ioan-Andrei
04:28 PM
const searchParameters = {
  q: key,
  query_by: 'name',
  query_by_weights: 10,
}
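The weight guidance from earlier in the thread (`query_by_weights: 10` for a single field, something like `4,3,2` for several) can be sketched as per-collection parameter objects. The values are the examples given in the thread, not tuned recommendations. `buildSearchParameters` is a hypothetical helper, and both pairing one weight with each field and sending the weights as a comma-separated string are assumptions about how the app would use the parameter.

```javascript
// Hypothetical helper: builds search parameters with one weight per query_by field.
// Assumption: query_by_weights is sent as a comma-separated string, in the same
// order as the fields listed in query_by.
function buildSearchParameters(key, fields, weights) {
  if (fields.length !== weights.length) {
    throw new Error('query_by and query_by_weights need the same number of entries');
  }
  return {
    q: key,
    query_by: fields.join(','),
    query_by_weights: weights.join(','),
  };
}

// Single field (countries collection): one weight.
const countryParams = buildSearchParameters('romania', ['name'], [10]);

// Multiple fields: one weight per field, highest for the field that matters most.
const attractionParams = buildSearchParameters(
  'romania',
  ['attraction_name', 'destination_name', 'parent_destination_name'],
  [4, 3, 2],
);

console.log(countryParams.query_by_weights);    // 10
console.log(attractionParams.query_by_weights); // 4,3,2
```

Either object could then be passed to `.search(...)` in the same way as the `searchParameters` objects shown in the thread.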
Ioan-Andrei
04:28 PM
const searchParameters = {
  q: key,
  query_by: 'destination_name',
}
Ioan-Andrei
04:28 PM
const searchParameters = {
  q: key,
  query_by: 'collection_name',
}
Ioan-Andrei
04:29 PM
Ioan-Andrei
04:29 PM
const searchParameters = {
  q: key,
  query_by: 'destination_name, parent_destination_name',
}
Ioan-Andrei
04:29 PM
Kishore Nallan
04:30 PM
Jun 18, 2021 (29 months ago)
Kishore Nallan
07:29 AM
`romania` on just the destination collection against the `destination_name` field?
Kishore Nallan
07:36 AM
`"destination_name": "Rome"` record had such a high match score value of `4328219393` in your JSON snippet above. So I indexed just that record in a new collection and queried for it. It is returning a different and lower match score. See here: https://gist.github.com/kishorenc/05789f80175135f7d7d5e9f0b944819f
Ioan-Andrei
09:54 AM
Ioan-Andrei
09:54 AM
Kishore Nallan
09:54 AM