Lilly Jeba
08/22/2025, 6:00 AM
Make: Aion should restrict Model to only those related to Aion like Model: SS, and further restrict Model Year to only valid years like 2017, and so on.
Initially, we solved this by flattening all possible combinations per idea into separate records — for example:
{
"make": "Aion",
"model": "SS",
"modelyear": "2017",
"variant": "Knight Edition",
"powertrain": "Hybrid",
"projectcode": "JUN23V1"
}
But this leads to combinatorial explosion. Just 80 ideas generate 8 million records due to all the nested combinations. At scale (e.g., 500+ ideas), this becomes unmanageable in terms of performance and indexing time.
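To make the growth concrete, here is a minimal sketch of the flattening step, assuming each idea carries an independent list of values per attribute and that every combination becomes its own record. The attribute values are made up for illustration; only the field names come from the record above. By the figures quoted (8 million records for 80 ideas), the real data averages roughly 100,000 records per idea.

```python
from itertools import product
from math import prod

# Hypothetical attribute values for one idea (illustrative, not real data).
idea = {
    "make": ["Aion", "Alfa Romeo"],
    "model": ["SS", "Stelvio", "FV"],
    "modelyear": ["2016", "2017", "2019"],
    "variant": ["Knight Edition", "Base"],
    "powertrain": ["Hybrid", "Electric"],
    "projectcode": ["JUN23V1", "JUN23V2"],
}


def flatten(idea):
    """Yield one flat record per combination of attribute values."""
    keys = list(idea)
    for combo in product(*(idea[k] for k in keys)):
        yield dict(zip(keys, combo))


records = list(flatten(idea))
# The record count is the product of the per-attribute list lengths.
expected = prod(len(values) for values in idea.values())
print(len(records), expected)  # 144 144
```

Each extra value in any one attribute multiplies the total, which is why the record count grows so quickly as more attributes and values are added.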
To solve this, we changed the data structure to store each field as an array (stringified or native), like this:
"projects": [
{
"attribute_make": "xxxxx",
"attribute_model": "xxx",
"attribute_modelYear": "2023",
"whereapplied": "Eureka Benchmark Set"
},
{
"attribute_make": "xxxxx",
"attribute_model": "xxx",
"attribute_modelYear": "2021",
"whereapplied": "Idea applies"
}
],
This reduced the number of documents significantly and indexing works fine. However, we now face a problem: cascading filters stop working. For example, when filtering on make: Aion, the model facet still shows unrelated models like Dolphin or Test_001. Even in the Typesense Dashboard, the facet_by results don't reflect correct cascading behavior.
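The effect can be reproduced outside Typesense with a plain-Python model of filtering and facet counting. This is a sketch of the behavior described above, not Typesense's implementation; the document, the "OtherMake" value, and the facet_counts helper are hypothetical.

```python
from collections import Counter

# Hypothetical document in the array layout: one record, several makes/models.
docs = [
    {
        "id": "idea-1",
        "make": ["Aion", "OtherMake"],
        "model": ["SS", "Dolphin"],
    },
]


def facet_counts(docs, filter_field, filter_value, facet_field):
    """Filter whole documents, then count every value of the facet field."""
    matching = [d for d in docs if filter_value in d[filter_field]]
    return Counter(v for d in matching for v in d[facet_field])


counts = facet_counts(docs, "make", "Aion", "model")
print(dict(counts))  # {'SS': 1, 'Dolphin': 1}
```

Because the filter selects whole documents and the facet then counts every value in the matching document's model array, Dolphin is counted even though it never belonged to Aion.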
❓ Our core question:
Does Typesense fully support filtering on array fields ([]) in a way that preserves correct interdependent facet values for cascading filters?
Or is full flattening of data (with one combination per record) still the only way to get cascading filter behavior working properly?
Given our scale, flattening leads to massive performance and UX problems. We’re looking for the best practice or architectural guidance for handling complex facet relationships in a scalable and performant way using Typesense + InstantSearch.
Any recommendations or proven approaches would be really appreciated!
Fanis Tharropoulos
08/22/2025, 12:32 PM
Lilly Jeba
08/22/2025, 1:08 PM
string[]) for most of our 33 facets (mostly string array types).
However, we’ve run into issues, especially when it comes to nested filtering.
Example Scenario:
We have a nested structure like this:
"projects": [
{
"attribute_make": "xxxxx",
"attribute_model": "xxx",
"attribute_modelYear": "2023",
"whereapplied": "Applied B"
},
{
"attribute_make": "xxxxx",
"attribute_model": "xxx",
"attribute_modelYear": "2021",
"whereapplied": "Applied A"
}
]
When we filter for:
• attribute_modelYear = "2023"
• whereapplied = "Applied A"
We expect no match, because there’s no single object in the array that matches both conditions at once.
However, with the string array transformation (i.e., when modelYear and whereapplied are stored as string[] across the entire collection), the filter logic seems to work in an OR-like fashion across the nested array, returning any match of either condition, regardless of whether they occur together in the same object.
This results in incorrect matches: for example, returning both Applied A and Applied B under model year 2023.
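The difference between the two semantics can be sketched in plain Python using the projects array above. The two helper functions are hypothetical and only model the matching logic; they are not Typesense code. Strictly speaking, the array layout still ANDs the conditions, but each condition is checked against its own array, so different objects can satisfy different conditions.

```python
projects = [
    {"attribute_modelYear": "2023", "whereapplied": "Applied B"},
    {"attribute_modelYear": "2021", "whereapplied": "Applied A"},
]


def matches_per_object(projects, **conditions):
    """True only if a single object satisfies every condition."""
    return any(
        all(p.get(field) == value for field, value in conditions.items())
        for p in projects
    )


def matches_flattened(projects, **conditions):
    """Model of the string[] layout: each field is an independent set of values."""
    columns = {field: {p.get(field) for p in projects} for field in conditions}
    return all(value in columns[field] for field, value in conditions.items())


wanted = {"attribute_modelYear": "2023", "whereapplied": "Applied A"}
per_object = matches_per_object(projects, **wanted)
flattened = matches_flattened(projects, **wanted)
print(per_object, flattened)  # False True
```

The per-object check gives the expected "no match", while the flattened check matches because 2023 and Applied A each appear somewhere in the record.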
Question:
With this change to string[] for performance, is it still possible to apply filters that behave like nested (per-object) filters in a flattened JSON structure?
How can we maintain the same filtering accuracy — where both filters must match within the same object of the array — when we switch to using arrays?
Any suggestions on how to handle this more effectively without losing performance?
Jason Bosco
08/25/2025, 9:08 PM
Lilly Jeba
08/28/2025, 5:34 AM
string[] arrays and object [] for performance optimization.
📌 Use Case Overview
We previously stored our data in a flattened format, where each record represented a unique, tightly scoped combination of values (e.g. System, Cluster, Make, Model, ModelYear, etc.). This ensured that any filter applied across those fields would yield only valid combinations.
Now, with our optimized string[] format, we store multiple values in arrays within a single record, improving performance but breaking the contextual link between values.
✅ Working Behavior (Flattened Data Model)
Example 1 — Valid record:
{
"systemname": "Body System",
"clustername": "Cluster_ Test",
"attribute_make": "Alfa Romeo",
"attribute_model": "Stelvio",
"attribute_modelyear": "2019"
}
Example 2 — Another valid record:
{
"systemname": "Glasses System",
"clustername": "Overhead System",
"attribute_make": "Alfa Romeo",
"attribute_model": "FV",
"attribute_modelyear": "2016"
}
When filters are applied (e.g. System = Body System AND Cluster = Cluster_ Test), only Example 1 is returned, as expected.
❌ Broken Behavior (String Array Model)
Optimized version stores them like:
{
"systemname": [
"Body System",
"Glasses System"
],
"clustername": [
"Cluster_ Test",
"Overhead System"
],
"attribute_make": "Alfa Romeo",
"attribute_model": [
"Stelvio",
"FV"
],
"attribute_modelyear": [
"2019",
"2016"
]
}
This now causes a critical issue:
When we filter:
• System = Body System
• Cluster = Cluster_ Test
The record is returned correctly, but with extra values (i.e. Overhead System and Glasses System), even though those combinations are not valid together.
So instead of showing only the valid combination:
Body System + Cluster_ Test
...we also end up showing:
Body System + Overhead System
Glasses System + Cluster_ Test
This is misleading in the UI and breaks logical accuracy for end-users.
🧩 What We Need
We're looking for a way to enforce that all filters apply to the same logical combination, even when using string arrays, essentially mimicking the behavior of object-level filters in flattened documents. From the documentation, we understand that v29 supports nested array filtering with syntax like:
ingredients.{name:=cheese && concentration:<30}
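Applied to the projects array from the earlier example, a filter in that style could be assembled as in the sketch below. It assumes projects is indexed as an array of objects with nested fields enabled. The nested_filter helper is hypothetical, does no escaping of special characters in values, and only builds the filter_by string; it does not call Typesense.

```python
def nested_filter(array_field, conditions):
    """Build a per-object filter string in the style quoted above."""
    clauses = " && ".join(
        f"{field}:={value}" for field, value in conditions.items()
    )
    # Doubled braces emit literal { and } around the joined clauses.
    return f"{array_field}.{{{clauses}}}"


filter_by = nested_filter(
    "projects",
    {"attribute_modelYear": "2023", "whereapplied": "Applied A"},
)
print(filter_by)
# projects.{attribute_modelYear:=2023 && whereapplied:=Applied A}
```

With the sample data shown earlier, no single object has both model year 2023 and Applied A, so a filter with per-object semantics would be expected to return nothing.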
But this works only for arrays of objects, not arrays of strings. Since our performance optimization requires us to use []string, we need guidance on:
1. Is there a way to simulate per-object filtering across string arrays?
2. If not, is there a recommended way to structure our data (possibly via nested objects) that still performs well but retains the correct filtering semantics?
Jason Bosco
08/29/2025, 3:15 PM