#community-help

Defining Nested JSON Schema for Querying

TLDR: Greg had trouble defining a schema for querying a JSON object with nested fields. Jason suggested several approaches; the one that worked was declaring the top-level contributors field with the type "object".

Sep 29, 2022 (12 months ago)
Greg
09:34 PM
I asked you about this last week but am still confused. You upgraded my instance to the .24 release to support nested objects. My JSON object looks like this:
{
...
    "contributors": {
        "contributor1": {
            "bio": "Something about this author",
            "firstName": "Lewis",
            "lastName": "Carroll",
            "type": "author"
        },
        "contributor2": ...
        "contributorX": ...
    }
...
}

It can have an unknown number of contributors. How do I define the schema for that? I will need to query by a contributor's first and last name and type, and be able to group by the author.
Jason
09:36 PM
Greg, are you able to structure that data as an array of objects like this? It will make it much easier to query:

{
...
    "contributors": [
        {
            "bio": "Something about this author",
            "firstName": "Lewis",
            "lastName": "Carroll",
            "type": "author"
        },
        { ... },
        { ... }
    ]
...
}
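A minimal sketch of the reshaping suggested above, assuming the migration can run over each document before indexing. The helper name `contributorsToArray` is hypothetical, and the sample values are the ones that appear in this thread.

```javascript
// Hypothetical helper: turns { contributor1: {...}, contributor2: {...} }
// into [ {...}, {...} ], the array-of-objects shape suggested above.
function contributorsToArray(contributors) {
  // Object.values keeps insertion order for non-numeric string keys
  return Object.values(contributors ?? {});
}

const doc = {
  contributors: {
    contributor1: { firstName: 'Lewis', lastName: 'Carroll', type: 'Author' },
    contributor2: { firstName: 'Greg', lastName: 'Mascherino', type: 'Illustrator' },
  },
};

// Copy the document and replace only the contributors field
const migrated = { ...doc, contributors: contributorsToArray(doc.contributors) };
console.log(JSON.stringify(migrated.contributors, null, 2));
```

The keys (contributor1, contributor2) are discarded here, so this only fits if nothing else depends on them.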
Greg
09:37 PM
Not without a significant refactor and data migration.
Greg
09:41 PM
It is possible if that is the only solution.
Jason
09:41 PM
Haven’t tested this, but could you try something like this:

{
  "name": "collection_name",
  "fields": [
    {"name": "contributors\..*.firstName", "type": "auto" },
    {"name": "contributors\..*.lastName", "type": "auto" },
    {"name": "contributors\..*.type", "type": "auto" }
  ]
}
Jason
09:42 PM
We’re essentially describing fields based on regex
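To illustrate what "describing fields based on regex" means, here is a small sketch in plain JavaScript. It assumes nested keys are flattened into dot-separated names and that both dots around the wildcard are meant literally; how the server itself evaluates the pattern is not reproduced here. It also shows that a backslash before a dot is dropped when the name is written as a JavaScript string literal.

```javascript
// Flattened key names, assuming nested keys are joined with dots
const flattened = [
  'contributors.contributor1.firstName',
  'contributors.contributor2.firstName',
  'contributors.contributor1.bio',
];

// Assumed intent: literal dot, any run of characters, literal dot, "firstName"
const pattern = /^contributors\..*\.firstName$/;
const matches = flattened.filter((name) => pattern.test(name));
console.log(matches);

// In a JavaScript string literal, "\." is just ".", so the backslash is lost
const fromLiteral = 'contributors\..*.firstName';
console.log(fromLiteral === 'contributors..*.firstName'); // true
```

To keep the backslash in a JavaScript string, it would have to be written as a double backslash.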
Greg
09:43 PM
Ok. Trying it now. Thank you
Greg
09:47 PM
The schema generated successfully:
    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "contributors..*.firstName",
      "nested": false,
      "optional": true,
      "sort": false,
      "type": "string"
    },
    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "contributors..*.lastName",
      "nested": false,
      "optional": true,
      "sort": false,
      "type": "string"
    },
    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "contributors..*.type",
      "nested": false,
      "optional": true,
      "sort": false,
      "type": "string"
    },

However, I can’t query by those fields in the Typesense UI. I am going to try it in code.
Jason
09:48 PM
You would have to index at least one document before those additional fields show up
Greg
09:48 PM
I did.
Greg
09:48 PM
I indexed 500 records
Jason
09:48 PM
Hmm
Jason
09:48 PM
Could you refresh the schema page on Typesense Cloud and post the schema it now shows you after indexing?
Greg
09:49 PM
Let me check my record. I hard-coded some records for testing.
Greg
09:58 PM
Do I need the parent contributors?
Jason
09:59 PM
Shouldn’t be needed…
Greg
09:59 PM
This is what the schema looks like:
{
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "contributor..*.firstName",
      "nested": false,
      "optional": true,
      "sort": false,
      "type": "auto"
    },
    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "contributor..*.lastName",
      "nested": false,
      "optional": true,
      "sort": false,
      "type": "auto"
    },
    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "contributor..*.type",
      "nested": false,
      "optional": true,
      "sort": false,
      "type": "auto"
    },

This is the record in Typesense:
contributors
{ "contributor1": { "firstName": "Lewis", "lastName": "Carroll", "type": "Author" }, "contributor2": { "firstName": "Greg", "lastName": "Mascherino", "type": "Illustrator" } }
Jason
10:00 PM
Btw, forgot to mention this:

{
  "name": "collection_name",
  "enable_nested_fields": true, // <=== this is required
  "fields": [
    {"name": "contributors\..*.firstName", "type": "auto" },
    {"name": "contributors\..*.lastName", "type": "auto" },
    {"name": "contributors\..*.type", "type": "auto" }
  ]
}
Greg
10:03 PM
That property is not getting saved.
exports.bookContentsSchema = {
  name: 'book-contents',
  enabled_nested_fields: true,
  fields: [
    {
      facet: false,
      name: 'ageDemographicFrom',
      optional: true,
      type: 'int32',
    },
    {
      facet: false,
      name: 'ageDemographicTo',

Result in Typesense Cloud:
{
  "created_at": 1664488909,
  "default_sorting_field": "",
  "enable_nested_fields": false,
  "fields": [
    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "ageDemographicFrom",
      "nested": false,
      "optional": true,
      "sort": true,
Jason
10:04 PM
There was a typo in my earlier snippet. It should be enable_nested_fields, not enabled_nested_fields.
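Greg's result suggests the misspelled key was ignored rather than rejected, so a quick local check before creating the collection can catch this kind of slip. The sketch below is illustrative only: the helper name `unknownSchemaKeys` is hypothetical, and the list of accepted top-level keys is an assumption that should be checked against the Typesense documentation for the server version in use.

```javascript
// Assumed list of top-level collection schema keys; verify it against the
// Typesense documentation for your server version.
const KNOWN_SCHEMA_KEYS = new Set([
  'name',
  'fields',
  'default_sorting_field',
  'enable_nested_fields',
  'token_separators',
  'symbols_to_index',
]);

// Hypothetical helper: returns any top-level keys not in the list above
function unknownSchemaKeys(schema) {
  return Object.keys(schema).filter((key) => !KNOWN_SCHEMA_KEYS.has(key));
}

const schema = {
  name: 'book-contents',
  enabled_nested_fields: true, // misspelled on purpose
  fields: [],
};

console.log(unknownSchemaKeys(schema)); // [ 'enabled_nested_fields' ]
```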
Greg
10:05 PM
Sorry. I should have looked closer. My bad. That’s what I get for just copying and pasting.
Jason
10:06 PM
my bad too!
Greg
10:09 PM
No luck.
{
    "code": 404,
    "error": "Could not find a field named `contributors..*.firstName` in the schema."
}
Greg
10:10 PM
Wait. Another typo. Should be contributor..*
Greg
10:11 PM
Still same thing
Greg
10:11 PM
{
    "code": 404,
    "error": "Could not find a field named contributor..*.firstName in the schema."
}
Jason
10:17 PM
That 404 is during a search, right?
Jason
10:19 PM
If so, that’s expected because during a search you have to specify each field explicitly without a regex. So you would have to do query_by: contributors.contributor1.firstName, contributors.contributor2.firstName,...
Jason
10:19 PM
This is the downside of using an object to index that field: you have to explicitly specify each field name.
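Since every nested path has to be spelled out in query_by, the list can be generated instead of typed by hand. This is a sketch under the assumption that the set of contributor keys is known in advance (for example, an upper bound on the number of contributors); the helper name `buildQueryBy` is hypothetical.

```javascript
// Hypothetical helper: expands contributor keys and sub-field names into the
// comma-separated list of explicit field paths used for query_by.
function buildQueryBy(contributorKeys, subFields, parent = 'contributors') {
  return contributorKeys
    .flatMap((key) => subFields.map((sub) => `${parent}.${key}.${sub}`))
    .join(',');
}

const queryBy = buildQueryBy(
  ['contributor1', 'contributor2'],
  ['firstName', 'lastName'],
);
console.log(queryBy);
```

The resulting string would be passed as the query_by search parameter. Whether every listed path must exist in the schema is not covered by this sketch.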
Greg
10:21 PM
Got it. That makes sense.
Greg
10:26 PM
Still same result 😕
Greg
10:26 PM
{
    "code": 404,
    "error": "Could not find a field named contributors.contributor1.firstName in the schema."
}
Jason
10:26 PM
Could you share the full search query you’re using with all the search params, and also the full schema?
Greg
10:27 PM
Schema from Node:
exports.bookContentsSchema = {
  name: 'book-contents',
  enable_nested_fields: true,
  fields: [
    {
      facet: false,
      name: 'ageDemographicFrom',
      optional: true,
      type: 'int32',
    },
    {
      facet: false,
      name: 'ageDemographicTo',
      optional: true,
      type: 'int32',
    },
    {
      facet: false,
      name: 'bisac1',
      optional: true,
      type: 'string',
    },
    {
      facet: false,
      name: 'bisac2',
      optional: true,
      type: 'string',
    },
    {
      facet: false,
      name: 'bisac3',
      optional: true,
      type: 'string',
    },
    {
      facet: false,
      name: 'chunk',
      optional: false,
      type: 'string',
    },
    {
      facet: false,
      name: 'cfi',
      optional: false,
      type: 'string',
    },
    {
      name: 'contributor..*.firstName',
      type: 'auto',
    },
    {
      name: 'contributor..*.lastName',
      type: 'auto',
    },
    {
      name: 'contributor..*.type',
      type: 'auto',
    },
    {
      facet: false,
      name: 'genres',
      optional: true,
      type: 'string[]',
    },
    {
      facet: true,
      name: 'isbn',
      optional: false,
      type: 'string',
    },
    {
      facet: false,
      name: 'keywords',
      optional: true,
      type: 'string[]',
    },
    {
      facet: false,
      name: 'pageCount',
      optional: true,
      type: 'int32',
    },
    {
      facet: false,
      name: 'physicalCopyLink',
      optional: true,
      type: 'string',
    },
    {
      facet: false,
      name: 'publicationDate',
      optional: true,
      type: 'int64',
    },
    {
      facet: false,
      name: 'published',
      optional: true,
      type: 'bool',
    },
    {
      facet: false,
      name: 'subtitle',
      optional: true,
      type: 'string',
    },
    {
      facet: false,
      name: 'thumbnail',
      optional: true,
      type: 'string',
    },
    {
      facet: false,
      name: 'title',
      optional: false,
      type: 'string',
    },
  ],
};
Jason
10:27 PM
Actually it might be easier if you can give me permission to access your data on your Typesense Cloud cluster (we require explicit permission before we can access customer data). Would that be ok?
Greg
10:27 PM
Yes, you can access it. Cluster ID: ujgk3a7dsmvz1l04p
Greg
10:28 PM
Here is my front end code:
 const searchParameters = {
      searches: [
        {
          group_by: 'isbn',
          group_limit: 1,
          per_page: 5,
          query_by: 'title',
        },
        {
          group_by: 'isbn',
          group_limit: 1,
          per_page: 5,
          query_by: 'subtitle',
        },
        {
          group_by: 'isbn',
          group_limit: 1,
          per_page: 5,
          query_by: 'keywords',
        },
        {
          group_by: 'isbn',
          group_limit: 1,
          per_page: 5,
          query_by: 'genres',
        },
        {
          group_by: 'isbn',
          group_limit: 1,
          per_page: 5,
          query_by: 'isbn',
        },
        { 
          query_by: 'contributors.contributor1.firstName',
          per_page: 5,
        },
        {
          group_by: 'isbn',
          group_limit: 1,
          per_page: 5,
          query_by: 'chunk',
        },
      ],
    };
    const commonSearchParameters = {
      collection: 'book-contents',
      q: e.target.value,
    };
    typesenseClient.multiSearch
      .perform(searchParameters, commonSearchParameters)
      .then((response) => {
        setSearchResults(response);
      })
      .catch((error) => {
        console.error(error);
      });
Greg
10:29 PM
All other fields seem to be working.
Jason
10:29 PM
There’s a typo in the schema:

"name": "contributor..*.firstName"

Should be: "name": "contributors..*.firstName"
Greg
10:30 PM
It is contributor
Greg
10:30 PM
contributors
{ "contributor1": { "firstName": "Lewis", "lastName": "Carroll", "type": "Author" }, "contributor2": { "firstName": "Greg", "lastName": "Mascherino", "type": "Illustrator" } }
Greg
10:30 PM
I tried both ways
Jason
10:30 PM
Actually that shouldn’t make a difference given the regex…
Jason
10:30 PM
Ok let’s try a different approach. Could you try this schema:

{
  "name": "collection_name",
  "enable_nested_fields": true,
  "fields": [
    {"name": "contributors", "type": "object" }
  ]
}
Jason
10:31 PM
contributors is the top-level field
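In the Node style Greg uses elsewhere in the thread, the schema above might look like the following sketch. It assumes `client` is an already configured typesense-js client and uses the collection name from Greg's earlier snippet; the function name `createBookContents` is hypothetical, and the other fields from the full schema are omitted.

```javascript
// Schema from the message above, using the collection name from this thread
const bookContentsSchema = {
  name: 'book-contents',
  enable_nested_fields: true, // required, per the earlier message
  fields: [{ name: 'contributors', type: 'object' }],
};

// `client` is assumed to be an already configured typesense-js client
async function createBookContents(client) {
  return client.collections().create(bookContentsSchema);
}
```

A caller would run something like `await createBookContents(client)` once, before indexing documents.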
Greg
10:32 PM
I think that worked
Jason
10:33 PM
Cool, looks like regex and nested fields still need work.
Greg
10:39 PM
Thank you for the help!
Greg
10:44 PM
I am glad I decided to switch from Algolia to Typesense. You guys are awesome!!!