Hi, I have created a schema describing once each o...
# community-help
g
Hi, I have created a schema that describes each of my fields once. When I retrieve the collection schema using the API, I can see one field that is duplicated. Here is part of the schema: `{ "facet": true, "index": true, "name": "craftsman.production_labels.*.*", "optional": true, "type": "string[]" }, { "facet": false, "index": true, "name": "date_updated", "optional": false, "type": "int64" }, { "facet": true, "index": true, "name": "craftsman.production_labels.*.*", "optional": true, "type": "string[]" }` Also, I get an error like this when querying by facet: ``Could not find a facet field named `craftsman.production_labels.*.*` in the schema.``
k
Hmm I wonder if there is a bug lurking here... Are you able to consistently recreate this issue?
g
First, I have the same bug in both the development and production environments. I'm not sure how to reproduce it yet.
k
I mean, if you create the same collection locally, do you have the same problem? Btw, Typesense does have a bug where we don't check for duplicate fields. But if there is only 1 field definition, we should not duplicate it further internally.
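Since duplicate definitions are not rejected, a client-side check on the retrieved schema can help spot them. The following is a minimal sketch, assuming the `fields` array has the shape shown in the schema excerpt above; `findDuplicateFieldNames` is a hypothetical helper, not part of any Typesense client:

```javascript
// Hypothetical helper: return every field name that appears more than once.
function findDuplicateFieldNames(fields) {
  const counts = new Map();
  for (const field of fields) {
    counts.set(field.name, (counts.get(field.name) || 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, count]) => count > 1)
    .map(([name]) => name);
}

// Field list copied from the schema excerpt in this thread (trimmed to name/type).
const retrievedFields = [
  { name: 'craftsman.production_labels.*.*', type: 'string[]' },
  { name: 'date_updated', type: 'int64' },
  { name: 'craftsman.production_labels.*.*', type: 'string[]' },
];

console.log(findDuplicateFieldNames(retrievedFields));
```

Running the check before and after each operation (collection creation, alias creation, indexing) is one way to narrow down which step introduces the duplicate.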
g
Ok, I see. I only create the collection once, with a schema where the field is described once. As for reproduction, I tested creating a new collection with only one field (using the definition of the duplicated one). This works fine: the field is not duplicated.
What kind of operations could I test that would mutate my schema?
Another useful piece of information: my collection is created programmatically with the same schema for dev and prod. In dev, I have a local Typesense running in Docker, and in prod, a Cloud one. Both envs are now in the same state with the duplicated field, so both have reacted the same way.
k
Schemas are immutable at the moment in Typesense so I don't see how they can get duplicated this way. We can try restarting Typesense server to see what happens after that.
g
Ok, I have something. I created the whole collection again and had no duplicated fields. I then triggered the indexing of a single document, and now I have the duplicated field. I will check exactly what is sent to the API when I index.
k
Ok that's great. If you can create a gist showing the exact sequence I can also debug and fix.
g
The correct sequence is:
• create the collection
• create the alias
• index the document
k
Thanks I will take a look. What field gets duplicated here?
g
The duplicated field is "craftsman.production_labels.*.*"
k
👍
@gab gab Any reason why the indexing document also has wildcards in the field name:
'craftsman.production_labels.*.*': [ 'Natura-Veal' ],
In the schema, you have:
{
  name: 'craftsman.production_labels.*.*',
  type: 'string[]',
  optional: true,
  facet: true
},
This means: "Accept any field name that begins with `craftsman.production_labels.`". When Typesense sees an actual field matching that rule, it creates an entry in the schema with the actual field name and its type.

Since the document that is indexed repeats the `.*` stuff in the field name, you end up with a duplicate. Now, we should certainly account for this edge case and not accept a document that contains a field name that duplicates a regexp field definition.
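The mechanism described above can be illustrated with a small model. This is a simplified sketch and not Typesense's actual implementation: it assumes the schema field name is applied as a regular expression anchored to the whole document key, and the `concrete` flag, the function names, and the key `craftsman.production_labels.bio.fr` are all hypothetical.

```javascript
// Simplified model of wildcard field resolution. This is NOT Typesense's code;
// it only illustrates the behaviour described in the thread.

// Assumption: the rule name is used as a regex anchored to the whole key.
function matchesRule(ruleName, key) {
  return new RegExp(`^${ruleName}$`).test(key);
}

// Return the schema as it would look after indexing `doc` in this model.
function schemaAfterIndexing(fields, doc) {
  const result = fields.map((f) => ({ ...f }));
  const rules = fields.filter((f) => !f.concrete && f.name.includes('.*'));
  for (const key of Object.keys(doc)) {
    const rule = rules.find((r) => matchesRule(r.name, key));
    const known = result.some((f) => f.concrete && f.name === key);
    if (rule && !known) {
      // A concrete entry is added under the actual key found in the document.
      result.push({ name: key, type: rule.type, concrete: true });
    }
  }
  return result;
}

// Guard suggested in the thread: find keys that literally repeat a rule name.
function literalWildcardKeys(fields, doc) {
  const ruleNames = new Set(
    fields.filter((f) => f.name.includes('.*')).map((f) => f.name)
  );
  return Object.keys(doc).filter((key) => ruleNames.has(key));
}

const schema = [{ name: 'craftsman.production_labels.*.*', type: 'string[]' }];
const badDoc = { 'craftsman.production_labels.*.*': ['Natura-Veal'] };
const goodDoc = { 'craftsman.production_labels.bio.fr': ['Natura-Veal'] }; // hypothetical key

console.log(schemaAfterIndexing(schema, badDoc).map((f) => f.name));
console.log(schemaAfterIndexing(schema, goodDoc).map((f) => f.name));
console.log(literalWildcardKeys(schema, badDoc));
```

In this model, `badDoc` produces two entries with the same name because the literal key both matches the rule and equals its name, while `goodDoc` produces one rule entry plus one distinct concrete entry. A check like `literalWildcardKeys` could be run client-side before indexing to catch the edge case.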
g
Ah ok! I wasn't aware of that wildcard field name behaviour. I thought the name was handled as a plain string. I understand now. I was using it because it was convenient for me: I use a framework that also uses that kind of syntax to control access depth.
Thanks for the help.
j
I also see a duplicate field entry in the schema, and I'm not sure if this is expected. @Kishore Nallan These are my fields:
const fields = [
  {
    name: `title_en`,
    type: 'string*',
    facet: false 
  },
  {
    name: `title_fr`,
    type: 'string*',
    facet: false 
  }
]
I create the schema through the API, and when I view the schema in the Typesense Cloud dashboard, it gives me back this:
[
  {
    "facet": false,
    "index": true,
    "infix": false,
    "locale": "",
    "name": "title_en",
    "optional": true,
    "sort": false,
    "stem": false,
    "type": "string*"
  },
  {
    "facet": false,
    "index": true,
    "infix": false,
    "locale": "",
    "name": "title_en",
    "optional": true,
    "sort": false,
    "stem": false,
    "type": "string"
  },
  {
    "facet": false,
    "index": true,
    "infix": false,
    "locale": "",
    "name": "title_fr",
    "optional": true,
    "sort": false,
    "stem": false,
    "type": "string*"
  },
  {
    "facet": false,
    "index": true,
    "infix": false,
    "locale": "",
    "name": "title_fr",
    "optional": true,
    "sort": false,
    "stem": false,
    "type": "string"
  }
]
Notice the only difference between the duplicated entries is that one has a type of `string` and the other a type of `string*` (again, not sure if this is expected). Also, when I'm on the Typesense Cloud search page, I see that every document contains 2 title_fr properties and 2 title_en properties.
k
Can you please post on a new thread? This is a 3-year-old thread 🙂
But just to answer your question: this is expected. We have a `string*` which is the base schema and then the concrete type `string` which is detected based on the first document indexed. This is expected if you use `string*` as a type in your schema.
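To tell this expected pairing apart from a genuine duplicate, the retrieved fields can be grouped by name and their types compared. The following is a rough sketch that only covers the `string*` / `string` case described in this thread; `groupTypesByName` and `isExpectedAutoPair` are hypothetical helpers, not Typesense APIs:

```javascript
// Hypothetical helper: collect the list of types reported for each field name.
function groupTypesByName(fields) {
  const groups = {};
  for (const { name, type } of fields) {
    if (!groups[name]) groups[name] = [];
    groups[name].push(type);
  }
  return groups;
}

// True when a name has exactly the base `string*` entry plus the concrete
// `string` entry, which is the pairing described as expected in this thread.
function isExpectedAutoPair(types) {
  return types.length === 2 && types.includes('string*') && types.includes('string');
}

// Names and types taken from the dashboard output posted above.
const dashboardFields = [
  { name: 'title_en', type: 'string*' },
  { name: 'title_en', type: 'string' },
  { name: 'title_fr', type: 'string*' },
  { name: 'title_fr', type: 'string' },
];

const groups = groupTypesByName(dashboardFields);
for (const [name, types] of Object.entries(groups)) {
  console.log(name, types, isExpectedAutoPair(types) ? 'expected pair' : 'check manually');
}
```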
j
My bad, thank you