Limiting Returned Array Size and Using Joins in Document Search
TLDR James asked how to limit the returned array size in a document search. Jason suggested breaking the document into multiple records. After discussing join options and providing his collections data, James decided to duplicate the parent information in each child. Harpreet confirmed this approach.
Sep 22, 2023 (2 months ago)
James
08:34 PM
Jason
08:35 PM
If the size can be this high, you might want to consider breaking this out into multiple records and then using group_by to fetch one result from each group.
James
08:52 PM
James
09:02 PM
{
"field1": "abc",
"children": [
{ "field2": "def" },
{ "field3": "geh" }
]
}
I see how this can be done with nesting, but there is the issue of returning too many children, or of having to split the document as you suggest. If I separate the children into their own collection and join them to the parent, I don't see a way to search across both collections at once based on the current specs for joins. But maybe I'm wrong there. Do you see a solution with joining?
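The "split the document" option mentioned above can be sketched in Python as follows. This is only an illustration, assuming the nested array is called children as in the example document; the parent_id value, the generated id scheme, and the function name are hypothetical and not part of the thread.

```python
from __future__ import annotations

from typing import Any


def split_into_child_records(parent: dict[str, Any], parent_id: str) -> list[dict[str, Any]]:
    """Turn one nested document into one flat record per child.

    Assumption: the nested array lives under the key "children".
    """
    # Every top-level field except the nested array is copied onto each child.
    shared = {key: value for key, value in parent.items() if key != "children"}
    records = []
    for index, child in enumerate(parent.get("children", [])):
        # Hypothetical id scheme: parent id plus the child's position.
        record = {"id": f"{parent_id}_{index}", "parent_id": parent_id}
        record.update(shared)
        record.update(child)
        records.append(record)
    return records


nested = {
    "field1": "abc",
    "children": [{"field2": "def"}, {"field3": "geh"}],
}
records = split_into_child_records(nested, parent_id="p1")
```

Each resulting record carries parent_id, so a later search could group on that field to limit how many children come back per parent.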
Sep 23, 2023 (2 months ago)
Kishore Nallan
01:19 AM
Can you please post the parent and child collections that you have created, and the query you are now trying to run?
James
01:32 PM
{
"name": "parent",
"fields": [
{ "name": "parent_id", "type": "string" },
{ "name": "name", "type": "string" }
]
}
{
"name": "children",
"fields": [
{ "name": "child_id", "type": "string" },
{ "name": "name", "type": "string" },
{ "name": "parent_id", "type": "string", "reference": "parent.parent_id" }
]
}
Based on the current join documentation, it doesn't seem like you can use query_by with a referenced collection at all (when I try, the request hangs), but I'd like to do something like:
{
"q": "abc",
"query_by": "name,$parent(name)",
"collection": "children"
}
A simple solution in my case seems to be to add all parent fields directly to the children and keep a single collection containing only the children. I can then query across all the fields I want and use group_by to aggregate on the parent ID and limit the number of children returned per parent:
// collection
{
"name": "children",
"fields": [
{ "name": "child_id", "type": "string" },
{ "name": "name", "type": "string" },
{ "name": "parent_id", "type": "string" },
{ "name": "parent_name", "type": "string" }
]
}
// query
{
"q": "abc",
"query_by": "name,parent_name",
"collection": "children",
"group_by": "parent_id",
"group_limit": 5
}
The only downside, I think, is that I have to duplicate the parent information across many children, but the collection isn't that large, so I don't think this is an issue.
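To make the effect of group_by with group_limit on the denormalized collection concrete, here is a rough local illustration in Python. It is not how Typesense implements grouping and it ignores ranking; it only shows the idea that at most group_limit hits are kept per distinct parent_id. The sample hits and the function name are made up for the example.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any


def group_hits(hits: list[dict[str, Any]], group_by: str, group_limit: int) -> dict[str, list[dict[str, Any]]]:
    """Keep at most `group_limit` hits for each distinct value of `group_by`."""
    groups: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for hit in hits:
        bucket = groups[hit[group_by]]
        # Hits beyond the limit for this group are dropped.
        if len(bucket) < group_limit:
            bucket.append(hit)
    return dict(groups)


# Hypothetical matching children, each carrying duplicated parent fields.
hits = [
    {"child_id": "c1", "name": "abc one", "parent_id": "p1", "parent_name": "first"},
    {"child_id": "c2", "name": "abc two", "parent_id": "p1", "parent_name": "first"},
    {"child_id": "c3", "name": "abc three", "parent_id": "p1", "parent_name": "first"},
    {"child_id": "c4", "name": "other", "parent_id": "p2", "parent_name": "abc second"},
]
grouped = group_hits(hits, group_by="parent_id", group_limit=2)
```

With a limit of 2, the third child of p1 is left out while p2 keeps its single hit, which is the "limit the returned children per parent" behavior the original question was after.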
Kishore Nallan
01:52 PM
Sep 25, 2023 (2 months ago)
Harpreet
11:39 AM
{
"q": "abc",
"query_by": "name,parent_name",
"collection": "children",
"group_by": "parent_id",
"group_limit": 5
}
parent_id will be the reference field in the child collection. If you want the rest of the fields of the referenced parent to be included in the response, you can send "include_fields": "$parent(*)". If there are fields with common names in both collections, you can specify "include_fields": "$parent(*) as parent" so that every field of the parent has parent. as a prefix.