Issues with Schema Creation and Nested Fields
TLDR Sean encountered a problem with schema creation involving auto nested fields. Kishore Nallan suggested checking the API response for errors and adding problematic fields to the schema as optional. Sean confirmed the advice.
Oct 02, 2023 (2 months ago)
Sean
04:37 AM My code is set to index batches of 1000 documents at a time using the `batch_size` parameter:
```
client.collections[collName].documents.import_(records, {'action': 'upsert', 'batch_size': 1000})
```
If I run the script for a minute and stop it after 50,000 documents, I see 50,000 records using the retrieve function.
When I add `enable_nested_fields: True` to the schema, create a new index and process the same 50,000 records, I see only a fraction of those listed in the retrieve call: `'num_documents': 3385`. Does my schema look correct?
```
my_schema = {"name": collName,
"enable_nested_fields": True,
"fields": [
{
'name' : 'title',
'type' : 'string',
'optional': True
},
{
'name' : 'description',
'type' : 'string',
'optional': True
},
{
'name' : 'product_type',
'type' : 'string',
'facet' : True,
'optional': True
},
{
'name' : 'vendor',
'type' : 'string',
'facet' : True,
'optional': True
},
{
'name' : 'sid',
'type' : 'int64',
'optional': True
},
{
'name' : 'tags',
'type' : 'string[]',
'facet' : True,
'optional': True
},
{
'name' : 'url',
'type' : 'string',
'index': False,
'optional': True
},
{
'name' : 'handle',
'type' : 'string',
'index': False,
'optional': True
},
{
'name' : 'last-updated',
'type' : 'int32',
'optional': True
},
{
'name' : 'missing',
'type' : 'bool',
'optional': True
},
{
"name" : "categories",
"type" : "string[]",
"optional": True
},
{"name": ".*", "type": "auto" }
]
}
client.collections.create(my_schema)
```
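A minimal sketch of how the import response could be inspected when the document count comes up short. The Typesense import endpoint returns HTTP 200 even when individual documents are rejected, so failures only show up in the per-document results. The sketch assumes the results arrive as a list of dicts with a `success` flag and an optional `error` string; `summarize_import` and the sample data are hypothetical names used for illustration.

```python
from collections import Counter


def summarize_import(results):
    """Return (number of successes, Counter of error messages).

    `results` is assumed to be a list of per-document dicts such as
    {'success': True} or {'success': False, 'error': '...'}.
    """
    failures = [r for r in results if not r.get("success")]
    errors = Counter(r.get("error", "unknown error") for r in failures)
    return len(results) - len(failures), errors


# Hypothetical response for a batch of three documents.
sample = [
    {"success": True},
    {"success": False, "error": "Field variants.option3 must be an array of string."},
    {"success": True},
]

ok, errors = summarize_import(sample)
print(f"{ok} indexed, {sum(errors.values())} rejected")
for message, count in errors.most_common():
    print(count, message)
```

In the real script, `sample` would be replaced by the value returned from the `import_` call for each batch.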
Kishore Nallan
04:43 AM
Sean
05:10 AM Error is
"error": "Field variants.option3 must be an array of string."
Looking at the actual documents, I do see that some of them have variants.option3 as null. I don't need to index null values, as described in this Typesense article. What is the best practice in this scenario: drop null fields before inserting, or turn them into an empty string?
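For the first option Sean raises (dropping null fields before inserting), a minimal sketch could look like the following. Whether this was the recommended approach is not preserved in the thread. `drop_nulls` and the sample document are hypothetical; only `variants.option3` comes from the error above.

```python
def drop_nulls(value):
    """Recursively remove dict keys whose value is None.

    Dicts nested inside lists are cleaned too. None entries that are
    direct list elements are left untouched.
    """
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


# Hypothetical document with a null nested field.
doc = {"title": "Shirt", "variants": [{"option1": "S", "option3": None}]}
cleaned = drop_nulls(doc)
print(cleaned)
```

Each record would be passed through `drop_nulls` before being handed to the import call.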
Kishore Nallan
05:11 AM
Sean
05:12 AM
Sean
05:13 AM {"name": ".*", "type": "auto" }?
Kishore Nallan
05:13 AM
Sean
05:18 AM
my_schema = {
    "name": collName,
    "enable_nested_fields": True,
    "fields": [
        {"name": ".*", "type": "auto", "optional": True }
    ]
}
That's OK though, I needed to type it out anyway for faceting.
Kishore Nallan
05:25 AM
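The thread summary mentions adding the problematic fields to the schema as optional. A sketch of what that might look like is below; it is an assumption rather than the exact schema settled on in the thread, and it has not been run against a server. The `string[]` type for `variants.option3` is inferred from the error message, and `coll_name` is a hypothetical collection name.

```python
coll_name = "products"  # hypothetical collection name

my_schema = {
    "name": coll_name,
    "enable_nested_fields": True,
    "fields": [
        # Assumed type, inferred from the error message in the thread.
        {"name": "variants.option3", "type": "string[]", "optional": True},
        # Facet fields still need to be declared explicitly.
        {"name": "product_type", "type": "string", "facet": True, "optional": True},
        # Catch-all for every remaining field.
        {"name": ".*", "type": "auto"},
    ],
}

# With a configured typesense.Client, the collection would be created with:
# client.collections.create(my_schema)
```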