Duplicate Field Issue in Schema Creation
TLDR: gab found a duplicated field in their schema. Kishore Nallan examined it and found that gab may have accidentally created the duplicate through wildcard field naming. The potential bug was identified and resolved.
Nov 03, 2021 (27 months ago)
gab
12:38 PM I have created a schema that describes each of my fields once. When I retrieve the collection schema using the API, I can see that one field is duplicated. Here is part of the schema.
{
"facet": true,
"index": true,
"name": "craftsman.production_labels.*.*",
"optional": true,
"type": "string[]"
},
{
"facet": false,
"index": true,
"name": "date_updated",
"optional": false,
"type": "int64"
},
{
"facet": true,
"index": true,
"name": "craftsman.production_labels.*.*",
"optional": true,
"type": "string[]"
}
Also, I get an error like this when querying by facet: `Could not find a facet field named craftsman.production_labels.. in the schema.`
Kishore Nallan
12:42 PM
gab
12:43 PM
Kishore Nallan
12:44 PM
gab
12:59 PM Regarding reproduction: I tried creating a new collection with only one field (using the schema of the duplicated one). This works fine; the field is not duplicated.
gab
01:19 PM
gab
01:31 PM So both environments reacted the same way.
Kishore Nallan
01:33 PM
gab
01:50 PM I will try to check exactly what is sent to the API when I index.
Kishore Nallan
01:54 PM
gab
02:10 PM
gab
02:11 PM
• create the collection
• create the alias
• index the document
Kishore Nallan
02:11 PM
gab
02:13 PM
Kishore Nallan
02:13 PM
Kishore Nallan
02:26 PM 'craftsman.production_labels.*.*': [ 'Natura-Veal' ],
Kishore Nallan
02:28 PM
{
name: 'craftsman.production_labels.*.*',
type: 'string[]',
optional: true,
facet: true
},
This means: "Accept any field name that begins with `craftsman.production_labels.`". When Typesense sees an actual field matching that rule, it creates an entry in the schema with the actual field name and its type. Since the document that is indexed repeats the `.*` stuff in the field name, you end up with a duplicate. Now, we should certainly account for this edge case and not accept a document that contains a field name that duplicates a regexp field definition.
Nov 04, 2021 (26 months ago)
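The mechanism Kishore Nallan describes can be illustrated with a small toy model. This is only a sketch of the idea, not Typesense's actual implementation: the function `resolve_schema`, the flag `reject_literal_wildcards`, and the placeholder value for `date_updated` are hypothetical and exist purely for illustration. It assumes a wildcard field name is treated as a regular expression, and that every matching document key gets its own concrete schema entry.

```python
import re

WILDCARD = ".*"


def resolve_schema(fields, document, reject_literal_wildcards=False):
    """Toy model (hypothetical): add a concrete schema entry for each
    document key that matches a wildcard field definition."""
    resolved = list(fields)
    wildcard_fields = [f for f in fields if WILDCARD in f["name"]]
    for key in document:
        for field in wildcard_fields:
            # The wildcard name is used as a regular expression here.
            if not re.fullmatch(field["name"], key):
                continue
            # Optional guard: refuse a key that literally repeats the
            # wildcard definition instead of a concrete field name.
            if reject_literal_wildcards and key == field["name"]:
                raise ValueError(f"field name duplicates a wildcard definition: {key}")
            resolved.append({**field, "name": key})
            break
    return resolved


fields = [
    {"name": "craftsman.production_labels.*.*", "type": "string[]",
     "optional": True, "facet": True},
    {"name": "date_updated", "type": "int64"},
]

# The document key repeats the wildcard pattern literally, as in the thread.
bad_doc = {
    "craftsman.production_labels.*.*": ["Natura-Veal"],
    "date_updated": 0,  # placeholder value, not from the thread
}

names = [f["name"] for f in resolve_schema(fields, bad_doc)]
duplicates = names.count("craftsman.production_labels.*.*")
print(names)
```

In this model the literal key matches its own pattern, so the same name lands in the schema twice. Passing `reject_literal_wildcards=True` raises an error instead, which corresponds to the "do not accept such a document" behaviour suggested in the thread.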
gab
07:30 AM
gab
07:33 AM