#community-help

Issues with Schema Creation and Nested Fields

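TLDR: After enabling nested fields on a collection with an auto schema, Sean saw only a fraction of the imported documents indexed. Kishore Nallan suggested checking the import API response for errors and adding the problematic field to the schema as optional. Sean agreed, and also found that `optional` does not appear to work on the `.*` auto field.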

Powered by Struct AI

Oct 02, 2023 (2 months ago)
Sean
04:37 AM
I'm running into a strange issue when I create a schema with auto nested fields.
My code is set to index batches of 1,000 documents at a time using the `batch_size` parameter:
```
client.collections[collName].documents.import_(records, {'action': 'upsert', 'batch_size': 1000})
```

If I run the script for a minute and stop it after 50,000 documents, I see 50,000 records using the retrieve function.
When I add `enable_nested_fields: True` to the schema, create a new index, and process the same 50,000 records, I see only a fraction of them listed in the retrieve call: `'num_documents': 3385`. Does my schema look correct?
```
my_schema = {
    "name": collName,
    "enable_nested_fields": True,
    "fields": [
        {
            'name': 'title',
            'type': 'string',
            'optional': True
        },
        {
            'name': 'description',
            'type': 'string',
            'optional': True
        },
        {
            'name': 'product_type',
            'type': 'string',
            'facet': True,
            'optional': True
        },
        {
            'name': 'vendor',
            'type': 'string',
            'facet': True,
            'optional': True
        },
        {
            'name': 'sid',
            'type': 'int64',
            'optional': True
        },
        {
            'name': 'tags',
            'type': 'string[]',
            'facet': True,
            'optional': True
        },
        {
            'name': 'url',
            'type': 'string',
            'index': False,
            'optional': True
        },
        {
            'name': 'handle',
            'type': 'string',
            'index': False,
            'optional': True
        },
        {
            'name': 'last-updated',
            'type': 'int32',
            'optional': True
        },
        {
            'name': 'missing',
            'type': 'bool',
            'optional': True
        },
        {
            'name': 'categories',
            'type': 'string[]',
            'optional': True
        },
        # Catch-all: auto-detect the type of any field not listed above
        {'name': '.*', 'type': 'auto'}
    ]
}
client.collections.create(my_schema)
```
Kishore Nallan
04:43 AM
Check the API response. If some docs are not indexed because of errors, those errors will be present in the API response.
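A minimal sketch of that check, assuming the Python client's `import_` call returns one result dict per document when given a list of documents, each with a `success` flag and, on failure, an `error` message. The helper name `summarize_import_errors` and the sample data are hypothetical and only for illustration.

```python
from collections import Counter


def summarize_import_errors(results):
    """Return (number of failed documents, Counter of error messages)."""
    failures = [r for r in results if not r.get("success", False)]
    messages = Counter(r.get("error", "unknown error") for r in failures)
    return len(failures), messages


# Hypothetical sample shaped like per-document import results.
# In a real script this would be the return value of:
#   client.collections[collName].documents.import_(records, {...})
sample_results = [
    {"success": True},
    {"success": False,
     "error": "Field variants.option3 must be an array of string.",
     "document": "{...}"},
    {"success": True},
]

failed_count, error_counts = summarize_import_errors(sample_results)
for message, count in error_counts.most_common():
    print(f"{count} document(s) failed: {message}")
```

Grouping by message keeps the output short when thousands of documents fail for the same reason.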
Sean
05:10 AM
Thank you for this. I was looking in the Typesense logs for errors and forgot to check the import response.

The error is `"error": "Field variants.option3 must be an array of string."` Looking at the actual documents, I do see that some of them have `variants.option3` as null. I don't need to index null values, as listed in this Typesense article.

What is the best practice in this scenario? Drop null fields before inserting or turn them into an empty string?
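For reference, the first option in the question (dropping null fields before inserting) could look roughly like the sketch below. The helper name `drop_nulls` and the sample document are hypothetical; this is not the approach recommended in the reply that follows, just an illustration of the alternative.

```python
def drop_nulls(value):
    """Recursively remove dict keys whose value is None.

    None entries that sit directly inside a list are left untouched.
    """
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


# Hypothetical document with a null nested field.
doc = {
    "title": "Shirt",
    "variants": [
        {"option1": "Red", "option3": None},
        {"option1": "Blue", "option3": "Large"},
    ],
}

cleaned = drop_nulls(doc)
print(cleaned)
```

The records would be passed through this helper before being handed to the import call.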
Kishore Nallan
05:11 AM
Add `variants.option3` to the schema and set it as an optional field.
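A sketch of what that schema entry might look like. The `string[]` type is an assumption inferred from the error message ("must be an array of string"), and `coll_name` is a hypothetical collection name. The thread does not confirm whether this alone makes the import accept the null values.

```python
coll_name = "products"  # hypothetical collection name

my_schema = {
    "name": coll_name,
    "enable_nested_fields": True,
    "fields": [
        # Explicit entry for the nested field, marked optional.
        # The type is assumed from the import error message.
        {"name": "variants.option3", "type": "string[]", "optional": True},
        # Catch-all for every field not declared explicitly.
        {"name": ".*", "type": "auto"},
    ],
}
```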
Sean
05:12 AM
Sounds good. I was planning on typing out the full schema anyways.
Sean
05:13 AM
One other quick question: does `{"name": ".*", "type": "auto"}` accept `optional` as well?
Kishore Nallan
05:13 AM
Interesting question! I'm not sure. Can you try it out? I presume it must work.
Sean
05:18 AM
Doesn't appear to work. The schema is:
my_schema = {
    "name": collName,
    "enable_nested_fields": True,
    "fields": [
        {"name": ".*", "type": "auto", "optional": True}
    ]
}

That's OK though; I needed to type it out anyway for faceting.
Kishore Nallan
05:25 AM
Thanks for confirming.

Similar Threads

Resolving Auto-Schema and Configuration Issues in Typesense

Narayan was struggling with auto-schema and configuration issues in Typesense. With the help of Kishore Nallan, they understood and solved the problems by adding 'optional' to all nested fields. They will find a way to handle 'None'.

Cold Start Problem with Dynamic Collections

Adrian reported cold start issues with dynamic collections. Jason suggested using wildcard `*` for query_by parameters, upgrading to `0.25.0.rc34`, and clarified conventions. Adrian's issues were resolved but they reported a limitation that will potentially be addressed.

Issue with Embedding Error in Version 0.25.0.rc63

Bill reported a bug in version 0.25.0.rc63 regarding a problem with updating or emplacing a document and receiving an embedding error. This was resolved in version 0.25.0.rc65, but further discussion ensued regarding the function of 'index' in the update feature.

Trouble in Implementing Deeply Nested Search

Anirudh is struggling to implement a two-level nested search. Jason asked for some specific examples to study the issue. Anirudh provided some material, realizing that adding top fields helped but might over-index. Jason then suggested reporting this issue on GitHub.

Troubleshooting Invalid Field Error in Firestore Document Indexing

Darren receives an error when indexing Firestore documents with empty array in "_grades" field. Jason suggests submitting a bug report and manually setting the schema. The user still experiences issues. Kishore Nallan reproduces the bug, but suggests a solution might exist with an explicit 'string[]' type definition. Further investigation is needed.
