Hello every one , we are facing some inconsistent ...
# community-help
d
Hello everyone, we are seeing inconsistent results on a large dataset with nested fields (I didn't notice any problem up to 1M documents, but now, testing with 17M documents, I see the issues below). We are using the Docker image typesense:29.0.rc20 (because we need this commit for stable operation). I'm not sure how to debug this; what should I look for? Did I do something wrong in the schema? • The filter returns documents that do not match. Query:
{q:'*',filter_by:'on_sale:true && sale.type:EnglishAuction',include_fields: 'price,on_sale,sale.price,sale.type,sale.primary', per_page:20}
{"facet_counts" => [],
 "found" => 1640,
 "hits" =>
  [{"document" => {"on_sale" => true, "price" => 90, "sale" => {"price" => 90, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 90, "sale" => {"price" => 90, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 260, "sale" => {"price" => 260, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 400, "sale" => {"price" => 400, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 100, "sale" => {"price" => 100, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 220, "sale" => {"price" => 220, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 520, "sale" => {"price" => 520, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 90, "sale" => {"price" => 90, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 550, "sale" => {"price" => 550, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 90, "sale" => {"price" => 90, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 320, "sale" => {"price" => 320, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 90, "sale" => {"price" => 90, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 320, "sale" => {"price" => 320, "primary" => false, "type" => "SingleSaleOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 4180, "sale" => {"price" => 4180, "primary" => false, "type" => "SingleSaleOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 540, "sale" => {"price" => 540, "primary" => false, "type" => "SingleSaleOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 330, "sale" => {"price" => 330, "primary" => false, "type" => "SingleSaleOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 590, "sale" => {"price" => 590, "primary" => false, "type" => "SingleSaleOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 410, "sale" => {"price" => 410, "primary" => false, "type" => "SingleSaleOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 90, "sale" => {"price" => 90, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "price" => 90, "sale" => {"price" => 90, "primary" => true, "type" => "EnglishAuction"}}, "highlight" => {}, "highlights" => []}],
 "out_of" => 15712569,
 "page" => 1,
 "request_params" => {"collection_name" => "blockchain_cards-1", "first_q" => "*", "per_page" => 20, "q" => "*"},
 "search_cutoff" => false,
 "search_time_ms" => 9}
• The sort order is not respected. The data seems only somewhat ordered, and I noticed that desc order is much more consistent than asc order. Query:
{q:'*',filter_by:'on_sale:true && sale.type:PrimaryOffer && sale.price:>1000' ,sort_by:'sale.price:asc',include_fields: 'on_sale,sale.price,sale.type,sale.primary', per_page:20}
{"facet_counts" => [],
 "found" => 3581,
 "hits" =>
  [{"document" => {"on_sale" => true, "sale" => {"price" => 2290, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 5300, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 1220, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 1220, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 1220, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 5300, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 5300, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 1220, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 5300, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 1220, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 1220, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 5300, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 5300, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 5300, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 1220, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 6200, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 1220, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 6200, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 6200, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []},
   {"document" => {"on_sale" => true, "sale" => {"price" => 6200, "primary" => true, "type" => "PrimaryOffer"}}, "highlight" => {}, "highlights" => []}],
 "out_of" => 15712568,
 "page" => 1,
 "request_params" => {"collection_name" => "blockchain_cards-1", "first_q" => "*", "per_page" => 20, "q" => "*"},
 "search_cutoff" => false,
 "search_time_ms" => 13}
And here is the relevant part of the schema (the schema has 139 fields; I filtered it down to the ones involved):
{"created_at" => 1747317842,
 "default_sorting_field" => "",
 "enable_nested_fields" => true,
 "fields" =>
  [
   {"facet" => true, "index" => true, "infix" => false, "locale" => "", "name" => "on_sale", "optional" => true, "sort" => true, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "bool"},
   {"facet" => false, "index" => true, "infix" => false, "locale" => "", "name" => "sale", "optional" => true, "sort" => false, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "object"},
   {"facet" => false, "index" => true, "infix" => false, "locale" => "", "name" => "sale.end_date", "optional" => true, "sort" => true, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "int64"},
   {"facet" => true, "index" => true, "infix" => false, "locale" => "", "name" => "sale.id", "optional" => true, "sort" => false, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "string"},
   {"facet" => true, "index" => true, "infix" => false, "locale" => "", "name" => "sale.price", "optional" => true, "range_index" => true, "sort" => true, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "int64"},
   {"facet" => true, "index" => true, "infix" => false, "locale" => "", "name" => "sale.primary", "optional" => true, "sort" => false, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "bool"},
   {"facet" => true, "index" => true, "infix" => false, "locale" => "", "name" => "sale.price_range", "optional" => true, "sort" => false, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "string"},
   {"facet" => true, "index" => true, "infix" => false, "locale" => "", "name" => "sale.type", "optional" => true, "sort" => false, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "string"},
   {"facet" => false, "index" => true, "infix" => false, "locale" => "", "name" => "sale.type_ranked", "optional" => true, "sort" => true, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "int64"}
]
 "name" => "blockchain_cards-1",
 "num_documents" => 15712567,
 "symbols_to_index" => [],
 "token_separators" => []}
Thanks!
• I'm using the Ruby client, but I confirmed with curl that I see the same issues. • I was able to reproduce these issues only when filtering/sorting on nested fields.
I'm wondering if this is linked to the bugs fixed in commit_1 and commit_2.
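A minimal sketch of how the two symptoms above (hits that do not match the filter, and hits that are out of order) could be checked mechanically on a search response. The helper names `get_path`, `filter_mismatches` and `sort_violations` are hypothetical, and the sample hits are illustrative rather than taken from the cluster.

```python
from typing import Any, Dict, List


def get_path(doc: Dict[str, Any], dotted: str) -> Any:
    """Follow a dotted path such as 'sale.type'; return None if any part is missing."""
    current: Any = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def filter_mismatches(hits: List[Dict[str, Any]], field: str, expected: Any) -> List[int]:
    """Return the positions of hits whose `field` differs from `expected`."""
    return [
        i for i, hit in enumerate(hits)
        if get_path(hit["document"], field) != expected
    ]


def sort_violations(hits: List[Dict[str, Any]], field: str, descending: bool = False) -> List[int]:
    """Return positions i where hit i and hit i+1 are out of order on `field`."""
    values = [get_path(hit["document"], field) for hit in hits]
    bad = []
    for i in range(len(values) - 1):
        a, b = values[i], values[i + 1]
        if a is None or b is None:
            continue  # missing values are reported by filter_mismatches instead
        out_of_order = (a < b) if descending else (a > b)
        if out_of_order:
            bad.append(i)
    return bad


if __name__ == "__main__":
    # Illustrative hits only, shaped like the "hits" array of a search response.
    hits = [
        {"document": {"sale": {"price": 2290, "type": "PrimaryOffer"}}},
        {"document": {"sale": {"price": 5300, "type": "PrimaryOffer"}}},
        {"document": {"sale": {"price": 1220, "type": "SingleSaleOffer"}}},
    ]
    print(filter_mismatches(hits, "sale.type", "PrimaryOffer"))
    print(sort_violations(hits, "sale.price"))
```

Running the same check against each node of the cluster would make it easier to compare how many bad hits each one returns.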
k
on_sale:true && sale.type:EnglishAuction
is not a valid filter_by clause because sale.type:EnglishAuction has no value on the RHS to match on.
d
i m not sure i understand the error in the filter
k
The format is sale.type:EnglishAuction: X, and the X is missing.
Same problem with the other filter query in the snippet above:
on_sale:true && sale.type:PrimaryOffer && sale.price:>1000
It must be sale.type:PrimaryOffer: X
d
sale.type is a string
{"facet" => true, "index" => true, "infix" => false, "locale" => "", "name" => "sale.type", "optional" => true, "sort" => false, "stem" => false, "stem_dictionary" => "", "store" => true, "type" => "string"}
If I want to filter all sales whose sale.type equals EnglishAuction, what should I put in the filter then?
k
My bad, sorry. That colon looked like a "." to me 😞
Pls ignore what I said. Relooking!
d
thank you very much 🙏
k
Is sale an array of objects?
I think one of the two commits you have pointed out could be an issue here.
Let me check if we have a more recent build that contains those fixes.
d
No, it should not be; however, that can change. I'm assuming that updating a document with a new sale replaces the old one rather than creating an array of 2 (we use import with emplace and only update the sale part).
k
We don't yet have a build with that latest fix for boolean + nested field. I will be able to share one in 4-5 hours.
d
thanks
I'll try to build locally and tell you if it gets fixed.
👍 1
The latest version does not seem to resolve the issue. It seems less random (maybe related to the node starting fresh vs. a node that has been up for a few days). We have a 5-node cluster.
curl "http://typesense-${i}:8108/multi_search" -X POST -H 'Content-Type: application/json' -d '{"searches": [{"collection": "blockchain_cards","q":"*","filter_by":"on_sale:true && sale.type:EnglishAuction","include_fields": "price,on_sale,sale.price,sale.type,sale.primary,id", "per_page":30}]}'| jq '.results[0]|.found'
(where i ranges from 0 to 4 inclusive, and node 4 is the upgraded one). Searching on the 5 nodes gives:
1506 --> 29-rc20
1506 --> 29-rc20
1506 --> 29-rc20
null  --> REBOOTING
895  --> LATEST MASTER
But the results are still inconsistent:
curl -H "x-typesense-api-key: $TYPESENSE_PROD_KEY" http://typesense-4:8108/multi_search -X POST -H 'Content-Type: application/json' -d '{"searches": [{"collection": "blockchain_cards","q":"*","filter_by":"on_sale:true && sale.type:EnglishAuction","include_fields": "price,on_sale,sale.price,sale.type,sale.primary,id", "per_page":30}]}'| jq '.results[0]|.hits[]| .document.sale.type'
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"SingleSaleOffer"
"EnglishAuction"
"PrimaryOffer"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
k
What happens when you drop the on_sale filter clause and keep only the sale.type one?
I want to understand whether the problem happens only together with the && clause or also when querying that field alone.
d
Checking. Documents with on_sale=false mostly do not have a sale object.
Same issue:
curl -H "x-typesense-api-key: $TYPESENSE_PROD_KEY" http://typesense-4:8108/multi_search -X POST -H 'Content-Type: application/json' -d '{"searches": [{"collection": "blockchain_cards","q":"*","filter_by":"sale.type:EnglishAuction","include_fields": "price,on_sale,sale.price,sale.type,sale.primary,id", "per_page":30}]}'| jq '.results[0]|.hits[]| .document.sale.type'
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"SingleSaleOffer"
null
null
"EnglishAuction"
"PrimaryOffer"
"EnglishAuction"
"EnglishAuction"
null
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
null
"EnglishAuction"
null
"EnglishAuction"
null
"EnglishAuction"
null
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
"EnglishAuction"
null
k
So weird. One way to debug the issue is if you can export out only that field (along with id), reindex that on a new collection and try the same query again. If that reproduces you can share the dataset with us and we can check what's happening.
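A rough sketch of the reduction step suggested above, assuming the export is a JSONL file with one document per line. The function names `project` and `reduce_export` are hypothetical. A parent that is null in the export is copied as null rather than dropped, since null values may matter for reproducing the problem.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def project(doc: Dict[str, Any], fields: List[str]) -> Dict[str, Any]:
    """Keep only `id` plus the requested dotted fields, preserving the nesting."""
    out: Dict[str, Any] = {"id": doc["id"]}
    for dotted in fields:
        keys = dotted.split(".")
        src: Any = doc
        dst = out
        for depth, key in enumerate(keys):
            if not isinstance(src, dict) or key not in src:
                break  # field absent in this document
            src = src[key]
            is_leaf = depth == len(keys) - 1
            if is_leaf or not isinstance(src, dict):
                dst[key] = src  # copy the leaf, or a null/scalar parent as-is
                break
            # Note: a parent object without the requested child ends up as {}.
            dst = dst.setdefault(key, {})
    return out


def reduce_export(lines: Iterable[str], fields: List[str]) -> Iterator[str]:
    """Turn exported JSONL lines into reduced JSONL lines ready for re-import."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.dumps(project(json.loads(line), fields))


if __name__ == "__main__":
    exported = [
        '{"id": "a", "on_sale": true, "sale": {"type": "EnglishAuction", "price": 90}}',
        '{"id": "b", "on_sale": false, "sale": null}',
    ]
    for reduced in reduce_export(exported, ["sale.type"]):
        print(reduced)
```

The reduced lines can then be imported into a fresh collection whose schema only declares the fields under test.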
d
ok i ll do that
(the dataset is 17M doc 🙂 )
Side note: I gave this field as an example, but I have the issue with other fields as well.
It was just an example.
curl -H "x-typesense-api-key: $TYPESENSE_PROD_KEY" http://typesense-4:8108/multi_search -X POST -H 'Content-Type: application/json' -d '{"searches": [{"collection": "blockchain_cards","q":"*","filter_by":"sale.primary:true","include_fields": "price,on_sale,sale.price,sale.type,sale.primary,id", "per_page":50}]}'| jq '.results[0]|.hits[]| .document.sale.primary'
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
true
null
true
true
true
null
true
null
null
null
null
true
true
true
null
true
true
true
true
true
true
true
k
Are all the fields you have problem with object fields?
d
Yes, they are object fields.
I don't know if it's a factor, but we have a lot of updates on our dataset, a few million per day.
k
We will need some way to reproduce the issue on our end. Happy to debug and get to the bottom of it
🙏 1
d
I created an index with [id,sale.type] but I was not able to reproduce the inconsistency.
I'm now trying with [id,on_sale,sale.type].
My idea is to reduce the dataset to a minimal scope.
Another theory I have is that it's somehow related to updates, for documents that at some point had one type of sale and now have a new one.
k
If updates are an issue then we might need to simulate that.
d
Also, I'm trying to run a cluster on v28 to see if I get the error there.
I need some time to make my code double-write.
k
That's a good hunch. Maybe the on-disk state and the in-memory state have diverged.
Once you do the double write and reproduce, we can just copy the data directory of the test node (without doing snapshot) to grab the state.
d
My understanding is that Typesense stores only the documents in RocksDB and reindexes on startup. Yesterday I had inconsistent results just after startup.
There may have been some updates though, as it takes 3 hours to start and then apply all the diffs.
k
The updates are replayed from a log
d
before restart i did compact/snapshot
k
Better not to do snapshot because that will not replay the logs which is what would trigger any update issues.
d
i m able to reproduce with minimal example. validating on simple index
k
Awesome 🙏
d
# drop the collection if it exists
DELETE http://localhost:8200/collections/minimal_test_collection
X-TYPESENSE-API-KEY: {{api_key}}

### create the collection

POST http://localhost:8200/collections
content-type: application/json
x-typesense-api-key: {{api_key}}

{"name": "minimal_test_collection", "enable_nested_fields": true, "fields": [{"name": "id", "type": "string"}, {"name": "sale.type", "type": "string", "optional": true}]}

#### add 1 document

POST http://localhost:8200/collections/minimal_test_collection/documents/import?action=emplace
content-type: application/json
x-typesense-api-key: {{api_key}}

{"id": "test_slug", "sale": {"type": "EnglishAuction"}}

### update sale to null

POST http://localhost:8200/collections/minimal_test_collection/documents/import?action=emplace
content-type: application/json
x-typesense-api-key: {{api_key}}

{"id": "test_slug", "sale": null}

### set a new sale type

POST http://localhost:8200/collections/minimal_test_collection/documents/import?action=emplace
content-type: application/json
x-typesense-api-key: {{api_key}}

{"id": "test_slug", "sale": {"type": "SingleSaleOffer"}}

### fetch the document

GET http://localhost:8200/collections/minimal_test_collection/documents/test_slug
X-TYPESENSE-API-KEY: {{api_key}}

### search on the old sale.type
POST http://localhost:8200/multi_search
content-type: application/json
x-typesense-api-key: {{api_key}}

{
    "searches": [
        {
            "collection": "minimal_test_collection",
            "q": "*",
            "filter_by": "sale.type:EnglishAuction",
            "include_fields": "sale.type,id",
            "per_page": 200
        }
    ]
}
Short version: with nested fields enabled, if I set an object to null, its sub-fields stay in the index. In this case, if I "emplace"
{
  "id": "test_slug",
  "sale": {
    "type": "EnglishAuction"
  }
}
then
{
  "id": "test_slug",
  "sale": null
}
the search on sale.type:EnglishAuction will still return the document,
even though the document itself is returned without a sale field.
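To make the sequence above easier to reason about, here is a toy model of the reported behaviour. It is not Typesense code and says nothing about the real internals: `ToyNestedIndex` and its `null_deletes` flag are hypothetical names, and the model only assumes that an emplace removes index entries derived from the previously stored value of a field.

```python
from typing import Any, Dict, Iterator, List, Set, Tuple


class ToyNestedIndex:
    """Toy model only, NOT Typesense code: mimics the behaviour reported in this thread."""

    def __init__(self, null_deletes: bool) -> None:
        self.null_deletes = null_deletes                 # True = null removes index entries
        self.docs: Dict[str, Dict[str, Any]] = {}        # id -> stored document
        self.index: Dict[Tuple[str, Any], Set[str]] = {} # (field, value) -> ids

    def _flatten(self, key: str, value: Any) -> Iterator[Tuple[str, Any]]:
        """Yield (dotted_field, leaf_value) pairs for a nested object."""
        if isinstance(value, dict):
            for child, child_value in value.items():
                yield from self._flatten(f"{key}.{child}", child_value)
        elif value is not None:
            yield key, value

    def _unindex_old(self, doc_id: str, key: str) -> None:
        """Remove the index entries derived from the currently stored value of `key`."""
        for pair in self._flatten(key, self.docs[doc_id].get(key)):
            self.index.get(pair, set()).discard(doc_id)

    def emplace(self, doc: Dict[str, Any]) -> None:
        doc_id = doc["id"]
        self.docs.setdefault(doc_id, {"id": doc_id})
        for key, value in doc.items():
            if key == "id":
                continue
            if value is not None or self.null_deletes:
                self._unindex_old(doc_id, key)
            # Otherwise (modelled bug): null leaves the old index entries in place.
            self.docs[doc_id][key] = value
            for pair in self._flatten(key, value):
                self.index.setdefault(pair, set()).add(doc_id)

    def search(self, field: str, value: Any) -> List[str]:
        return sorted(self.index.get((field, value), set()))


if __name__ == "__main__":
    for null_deletes in (False, True):
        idx = ToyNestedIndex(null_deletes=null_deletes)
        idx.emplace({"id": "test_slug", "sale": {"type": "EnglishAuction"}})
        idx.emplace({"id": "test_slug", "sale": None})
        idx.emplace({"id": "test_slug", "sale": {"type": "SingleSaleOffer"}})
        print(null_deletes, idx.search("sale.type", "EnglishAuction"))
```

In the `null_deletes=False` variant the stale entry survives the later update too, because the stored value is already null by the time the new sale arrives, which matches what was observed in the search results.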
k
Got it. When you set the object to null, can you try with action=upsert?
I suspect that with emplace, "sale": null is treated as just "sale": {}, i.e. the object being unchanged. Upsert might treat it as deletion.
d
Yes, upsert works.
With upsert the document is not returned in the search; with emplace, however, it is still returned.
k
Both update and emplace will have this quirk of treating null as {}.
d
But when fetching the document, the sale object is returned as null.
And then, when updating with another sale.type, the reference to the old sale.type continues to exist.
k
Yes, this is certainly a bug. I'm trying to recall why this distinction was implemented. I will be taking a look and fixing it.
d
thank you 🙏
Another point to take into account (at least in our case): we need a way to remove the sale object from a document without updating the full document.
k
Thanks for helping reproduce it. I will post an update once I have a patch. Might take a couple of days.
Making null behave like deletion on emplace will be the fix.
d
🙏
k
For the specific issue highlighted above, I've a patch in the 29.0.rc25 build we just published.
d
i ll test today
thank you
👍 1
I still see the error in prod, but I'm not able to reproduce it with my example. I'm trying to isolate what is different. For now, 1 pod out of 5 has rebooted (the full rolling restart takes 16-17h). Could that explain why I still have the issue? My initial guess is that the leader sends the document/update and each follower applies it locally, so I'd expect the upgraded node to have the correct data, but maybe I'm missing something here (for example, the leader deciding which fields get updated).
(And the number of errors rises over time, so it's the update pattern that causes the issue, maybe through another path though.)
I'm not able to reproduce it for now by sending the update to either rc24 or rc25: in both cases rc24 sends the wrong answer and rc25 sends the correct one.
But prod definitely has bad data:
curl -H "x-typesense-api-key: $TYPESENSE_PROD_KEY" http://typesense-4:8108/multi_search -X POST -H 'Content-Type: application/json' -d '{"searches": [{"collection": "blockchain_cards","q":"*","filter_by":"sale.type:EnglishAuction","include_fields": "sale.type,id", "per_page":200}]}'| jq -c '.results[0]|[.found,(.hits[]| .document.sale.type)]'
[1089,"EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction",null,"EnglishAuction","EnglishAuction","EnglishAuction",null,"EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction",null,"EnglishAuction","EnglishAuction","EnglishAuction",null,"EnglishAuction",null,"EnglishAuction",null,null,null,"EnglishAuction","EnglishAuction",null,null,null,null,"EnglishAuction","EnglishAuction",null,null,"EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction",null,"EnglishAuction","SingleSaleOffer","EnglishAuction","EnglishAuction",null,"EnglishAuction",null,"EnglishAuction","EnglishAuction",null,null,"EnglishAuction","EnglishAuction","EnglishAuction",null,"EnglishAuction","EnglishAuction",null,"EnglishAuction",null,null,null,null,"EnglishAuction",null,null,null,null,"EnglishAuction","EnglishAuction",null,null,"EnglishAuction",null,null,"EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction",null,"EnglishAuction",null,null,"EnglishAuction","EnglishAuction","EnglishAuction",null,null,"EnglishAuction",null,"EnglishAuction",null,"EnglishAuction",null,null,null,null,null,"SingleSaleOffer",null,null,null,"EnglishAuction","EnglishAuction",null,"EnglishAuction",null,null,null,"EnglishAuction",null,"EnglishAuction","EnglishAuction",null,null,"EnglishAuction",null,null,null,"SingleSaleOffer",null,null,"EnglishAuction","EnglishAuction",null,null,"EnglishAuction",null,null,null,"EnglishAuction",null,null,null,null,null,"EnglishAuction",null,null,null,null,"EnglishAuction",null,null,null,null,null,null,null,null,null,null,null,"SingleSaleOffer",null,null,null,"EnglishAuction","EnglishAuction",null,"EnglishAuction",null,null,"EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction","SingleSaleOffer","EnglishAuction","EnglishAuction",null,null,"EnglishAuction","EnglishAuction",null,null,n
ull,"EnglishAuction","EnglishAuction",null,null,"EnglishAuction","EnglishAuction",null,null,"EnglishAuction","EnglishAuction",null,"EnglishAuction","EnglishAuction","EnglishAuction","EnglishAuction"]
I will change the leader to v29 and see if it improves for the next nodes rebooting.
k
All writes, regardless of which node you send them to, are sent to the leader first. The leader then commits the write to the Raft log, which is propagated to all the followers. Each follower then indexes it locally.
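The write path described above can be pictured with a small toy model. This is a simplification under stated assumptions, not how Typesense or Raft is actually implemented: there is no quorum, election or networking, the classes `Node` and `Cluster` are hypothetical, and "indexing" is just storing the document in a dict.

```python
from typing import Any, Dict, Iterable, List


class Node:
    """One cluster member with its own local index (a plain dict here)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.applied = 0                          # log entries already indexed
        self.index: Dict[str, Dict[str, Any]] = {}

    def catch_up(self, log: List[Dict[str, Any]]) -> None:
        """Index every log entry this node has not applied yet, in log order."""
        for doc in log[self.applied:]:
            self.index[doc["id"]] = doc
        self.applied = len(log)


class Cluster:
    def __init__(self, names: Iterable[str], leader: str) -> None:
        self.nodes = {name: Node(name) for name in names}
        if leader not in self.nodes:
            raise ValueError(f"unknown leader: {leader}")
        self.leader = leader
        self.log: List[Dict[str, Any]] = []       # the single replicated log

    def write(self, via: str, doc: Dict[str, Any], offline: Iterable[str] = ()) -> None:
        """Accept a write on any node; it always lands in the leader's log first."""
        if via not in self.nodes:
            raise ValueError(f"unknown node: {via}")
        self.log.append(doc)                      # leader commits the write
        down = set(offline)
        for name, node in self.nodes.items():
            if name not in down:
                node.catch_up(self.log)           # each node indexes locally


if __name__ == "__main__":
    cluster = Cluster(["typesense-0", "typesense-1", "typesense-2"], leader="typesense-0")
    cluster.write("typesense-2", {"id": "test_slug", "sale": {"type": "EnglishAuction"}})
    cluster.write("typesense-1", {"id": "test_slug", "sale": None}, offline=["typesense-2"])
    cluster.nodes["typesense-2"].catch_up(cluster.log)  # node comes back and replays
    print({name: node.index for name, node in cluster.nodes.items()})
```

Because every node applies the same log with its own code, a node running a different build can end up with a different local index from the same sequence of writes.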
d
So if it does not work now, it will not work when the rollout ends?
k
Do you see the node with rc25 return bad data?
d
yes
k
in both cases rc24 send wrong answer and rc25 sends the correct one
Then what do you mean by "correct one" here?
d
With my test they return correct data, but not in prod. So there may be another ingredient to take into account.
k
Yes, I think that will be the likely explanation. Are you also sending an empty { } expecting that to delete the data?
d
If you look at the example above, I have null returned.
k
Does it exhibit the issue of it returning a wrong filter value? E.g. you query for X but you get Y back?
d
yes
k
Then there must be another unaccounted condition.
Maybe setting {sale: {type: null}} in the update?
d
But immediately after startup I do not see any errors (yesterday that was not the case: once the node caught up with the latest updates, it started to have inconsistent data). Today they appear only after a while, when the updates happen.
k
On restart the state will be clean, so this is definitely an update issue. Maybe less frequent because the specific trigger update value is less frequent.
d
Maybe setting {sale: {type: null}} in the update?
The code should not allow this: either we have a full sale or none. The object is only created when there is a sale, and type is the "class" of the object.
Double-checking though.
(I also see the error on the boolean field primary:true/false, which returns wrong data too.)
k
Maybe if there is a way to enable some logging for some time on updates and then maybe you can track a bad record back to logs to see what exact update caused it.
d
Do you mean some Typesense logs? I should have logs for all operations that do updates. I do not have the full detail of what was sent, only what was intended to be done.
I'll try to compile that and see if I can find the path that leads to the issue.
k
Not typesense log, but log update values in your application logs.
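One possible shape for such application-side logging, as a sketch only: the logger name and the helpers `suspicious_paths` and `log_update` are hypothetical. It records every update payload before it is sent and flags any field whose value is null or an empty object, since those are the values suspected in this thread.

```python
import json
import logging
from typing import Any, Dict, Iterator, List

logger = logging.getLogger("typesense_updates")  # hypothetical logger name


def suspicious_paths(value: Any, prefix: str = "") -> Iterator[str]:
    """Yield dotted paths whose value is None or an empty object."""
    if value is None or value == {}:
        yield prefix
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from suspicious_paths(child, f"{prefix}.{key}" if prefix else key)


def log_update(action: str, doc: Dict[str, Any]) -> List[str]:
    """Log the exact payload about to be sent and return the flagged paths."""
    flagged = list(suspicious_paths(doc))
    logger.info(
        "action=%s id=%s flagged=%s payload=%s",
        action, doc.get("id"), flagged, json.dumps(doc, sort_keys=True),
    )
    return flagged


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log_update("emplace", {"id": "test_slug", "sale": None})
    log_update("emplace", {"id": "test_slug", "sale": {"type": "SingleSaleOffer"}})
```

Calling `log_update` right before each import request keeps the log aligned with what was actually sent, so a bad record found later can be traced back by its id.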
d
If you have an educated guess about what I should look for, it would be helpful.
k
I suspect it's some form of null value again.
d
I am able to reproduce it reliably again. What was missing in the previous config is that in the schema we define sale as an object. If we add that, the bug shows up again:
### drop the collection if it exists
DELETE http://localhost:8108/collections/minimal_test_collection
X-TYPESENSE-API-KEY: {{local_api_key}}

### create the collection

POST http://localhost:8108/collections
content-type: application/json
x-typesense-api-key: {{local_api_key}}

{"name": "minimal_test_collection", "enable_nested_fields": true, "fields": [{"name": "id", "type": "string"},{"name":"sale", "type": "object", "optional": true}, {"name": "sale.type", "type": "string", "optional": true, "facet": false}]}

#### add 1 document

POST http://localhost:8108/collections/minimal_test_collection/documents/import?action=emplace
content-type: application/json
x-typesense-api-key: {{local_api_key}}

{"id": "test_slug", "sale": {"type": "EnglishAuction"}}

### update sale to null

POST http://localhost:8108/collections/minimal_test_collection/documents/import?action=emplace
content-type: application/json
x-typesense-api-key: {{local_api_key}}

{"id": "test_slug", "sale": null}

### set a new sale type

POST http://localhost:8108/collections/minimal_test_collection/documents/import?action=emplace
content-type: application/json
x-typesense-api-key: {{local_api_key}}

{"id": "test_slug", "sale": {"type": "SingleSaleOffer"}}

### fetch the document

GET http://localhost:8108/collections/minimal_test_collection/documents/test_slug
X-TYPESENSE-API-KEY: {{local_api_key}}

### search on the old sale.type
POST http://localhost:8108/multi_search
content-type: application/json
x-typesense-api-key: {{local_api_key}}

{
    "searches": [
        {
            "collection": "minimal_test_collection",
            "q": "*",
            "filter_by": "sale.type:EnglishAuction",
            "include_fields": "sale.type,id",
            "per_page": 200
        }
    ]
}
With this collection definition we have the bug:
{
  "name": "minimal_test_collection",
  "enable_nested_fields": true,
  "fields": [
    {
      "name": "id",
      "type": "string"
    },
    {
      "name": "sale",
      "type": "object",
      "optional": true
    },
    {
      "name": "sale.type",
      "type": "string",
      "optional": true,
      "facet": false
    }
  ]
}
But with this definition there is no bug (the sale object is not defined):
{
  "name": "minimal_test_collection",
  "enable_nested_fields": true,
  "fields": [
    {
      "name": "id",
      "type": "string"
    },
    {
      "name": "sale.type",
      "type": "string",
      "optional": true,
      "facet": false
    }
  ]
}
k
Ok got it, I will take another look.
Thanks for tracking this down again!
d
Side note: is there any recommendation on whether we should define the sale object in the schema or not? I'm not sure I clearly understand what the object definition in the schema does. This was a relic from our previous config, where we defined the object with a fields param; later we decided on flattening and kept the empty object.
k
Defining the parent object in the schema helps discover new children automatically. Otherwise, just mentioning explicit child fields is better.
I've fixed this in 29.0.rc26.
d
Thanks, I'll test. Meanwhile we have removed the object definition from our schema. This made the search work properly with rc25.
👍 1
I was not able to reproduce the error, so it seems OK to me.
k
Thanks for confirming!
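For reference, the workaround mentioned above (removing the parent object definition while keeping the explicit child fields) could be applied to a schema definition roughly like this. It is a sketch under the assumption that the schema is available as a plain dict; `drop_parent_objects` is a hypothetical helper, not part of any client library.

```python
from typing import Any, Dict


def drop_parent_objects(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `schema` without `object` fields that have explicit child fields."""
    names = [field["name"] for field in schema["fields"]]

    def has_children(parent: str) -> bool:
        return any(name.startswith(parent + ".") for name in names)

    fields = [
        field for field in schema["fields"]
        if not (field.get("type") == "object" and has_children(field["name"]))
    ]
    return {**schema, "fields": fields}


if __name__ == "__main__":
    schema = {
        "name": "minimal_test_collection",
        "enable_nested_fields": True,
        "fields": [
            {"name": "id", "type": "string"},
            {"name": "sale", "type": "object", "optional": True},
            {"name": "sale.type", "type": "string", "optional": True, "facet": False},
        ],
    }
    print(drop_parent_objects(schema))
```

As noted in the thread, the parent object definition is what lets new child fields be discovered automatically, so after dropping it every child that should be indexed has to be listed explicitly.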