Hello ``` searches collection casemanagement cases filter by typesense #community-help

Hello ```{ "searches": [ { "collectio...

Taieb

03/18/2025, 6:00 PM

Hello

Copy code

{
  "searches": [
    {
      "collection": "casemanagement_cases",
      "filter_by": "id:* || $casemanagement_case_category_relations(id:*)",
      "include_fields": "$casemanagement_users(Name, strategy: merge) as AgentName,$casemanagement_clients(Name, strategy: merge) as ClientName,$queues_queuegroups(Name, strategy: merge) as GroupName, $casemanagement_case_category_relations(CategoryId, strategy: nest_array) as Categories",
      "q": "",
      "query_by": "ClientID",
      "limit": 5,
      "enable_lazy_filter": true,
      "validate_field_names": true
    }
  ]
}

I am doing a pretty complicated join. (not sure if best practice) but this works fine on dataset approx 2 million.. But it is pretty slow about 2 seconds to do this query. Do i miss something? When i remove the filter_by it goes quick about 2ms. The category_relation can have multiple caseIds that i want to join. Any suggestions?

Jason Bosco

03/18/2025, 7:21 PM

Just like in a relational database, doing joins at query time is computationally expensive, compared to pre-joining the data and storing them in a flattened way at indexing time. So there is a performance tradeoff involved when using JOINs.

Taieb

03/18/2025, 7:47 PM

When i remove this segment, it is quick again

Copy code

"filter_by": "id:* || $casemanagement_case_category_relations(id:*)",

Taieb

03/18/2025, 7:47 PM

@Jason Bosco

Taieb

03/18/2025, 7:50 PM

I understand that, i was hoping to not have to do it that's why i tried. But agree, will try to flatten it

Taieb

03/18/2025, 10:05 PM

bit dissapointed, i now normalized the data, so there's no joins. But some filters, and it is slower than mariadb with the same amount of data de-normalized...

Copy code

searchParameters {
  q: '*',
  query_by: 'ClientID',
  sort_by: 'Created:DESC',
  filter_by: 'InternalType:=[call,callback,chat,email,form,sms,social] && GroupId:=[135,162,383,445,584,585,615,616,0] && Status:=[Closed,Open]',
  page: 1,
  limit: 5
}

Do i do anything wrong or is this expected..? @Jason Bosco

Taieb

03/18/2025, 10:07 PM

Copy code

{
                name: schemaName,
                fields: [
                    { name: '.*', type: 'auto' },
                    { name: 'ID', type: 'int32' },
                    { name: 'CaseID', type: 'string', sort: true },
                    {
                        name: 'GroupName',
                        type: 'string',
                        optional: true,
                        sort: true,
                    },
                    { name: 'Status', type: 'string', sort: true },
                    { name: 'Created', type: 'string', sort: true },
                    { name: 'SLA', type: 'string', sort: true },
                ],
            }

Jason Bosco

03/19/2025, 1:45 AM

In your filter_by, could you try changing all the

:=

Taieb

03/19/2025, 8:02 AM

Will try that, i'll let you know how it went

Taieb

03/19/2025, 8:12 AM

Ok that made it faster, thanks! Any other tips and trix to make it faster? 🙂 🙂 @Jason Bosco

Harpreet Sangar

03/19/2025, 9:41 AM

I'm wondering about the Join's performance. @Taieb can you share the schemas of the collections you had prior to denormalization?

Taieb

03/19/2025, 9:41 AM

Sure 1 sec

Taieb

03/19/2025, 9:42 AM

Copy code

export function schema(schemaName: string): CollectionCreateSchema {
    switch (schemaName) {
        case 'casemanagement_cases':
            return {
                name: schemaName,
                fields: [
                    { name: '.*', type: 'auto' },
                    { name: 'ID', type: 'int32' },
                    {
                        name: 'UserId',
                        type: 'int32',
                        reference: 'casemanagement_users.UserId',
                        async_reference: true,
                    },
                    {
                        name: 'ClientID',
                        type: 'string',
                        reference: 'casemanagement_clients.ClientID',
                        async_reference: true,
                    },
                ],
                default_sorting_field: 'ID',
            };
        case 'casemanagement_clients':
            return {
                name: schemaName,
                fields: [
                    { name: '.*', type: 'auto' },
                    { name: 'ID', type: 'int32' },
                    {
                        name: 'ClientID',
                        type: 'string',
                    },
                ],
                default_sorting_field: 'ID',
            };
        case 'casemanagement_users':
            return {
                name: schemaName,
                fields: [
                    { name: '.*', type: 'auto' },

                    {
                        name: 'UserId',
                        type: 'int32',
                    },
                ],
            };

        default:
            return {
                name: schemaName,
                fields: [{ name: '.*', type: 'auto' }],
            };
    }
}

It was dynamically created, will this help you?

Harpreet Sangar

03/19/2025, 9:44 AM

What was the definition of

casemanagement_case_category_relations

collection?

Taieb

03/19/2025, 9:50 AM

sorry i missed that one sec

Taieb

03/19/2025, 9:51 AM

I have removed the code

Taieb

03/19/2025, 9:51 AM

will try to recreate it

Taieb

03/19/2025, 9:53 AM

casemanagement_case_category_relations had a relation to casemanagement_cases CaseID ON CaseIDRef

Taieb

03/19/2025, 9:53 AM

I don't have the code

Harpreet Sangar

03/19/2025, 9:53 AM

Okay, no issue. I'm just looking for reference fields.

Taieb

03/19/2025, 9:53 AM

Ok great!

Taieb

03/19/2025, 10:00 AM

any tips on how to get this quicker? @Harpreet Sangar

Copy code

searchParameters {
  q: '*',
  query_by: 'ClientID',
  sort_by: 'Created:DESC',
  filter_by: 'InternalType:=[call,callback,chat,email,form,sms,social] && GroupId:=[135,162,383,445,584,585,615,616,0] && Status:=[Closed,Open]',
  page: 1,
  limit: 5
}

Fields:

Copy code

[
                    { name: 'ID', type: 'int32', index: false, optional: true },
                    { name: 'Source', type: 'string', sort: true, index: false, optional: true },
                    { name: 'CaseID', type: 'string', sort: true, index: false, optional: true },
                    { name: 'ClientID', type: 'string', index: false, optional: true },
                    { name: 'Author', type: 'string', sort: true, index: true, optional: true },
                    { name: 'Parent', type: 'string', index: false, optional: true },
                    { name: 'Status', type: 'string', sort: true, index: true, optional: true },
                    { name: 'Topic', type: 'string', sort: true, index: true, optional: true },
                    { name: 'Description', type: 'string', index: false, optional: true },
                    { name: 'UserId', type: 'int32', index: false, optional: true },
                    { name: 'SLA', type: 'string', sort: true, index: false, optional: true },
                    { name: 'Answered', type: 'int32', optional: false },
                    { name: 'Responded', type: 'int32', index: false, optional: true },
                    { name: 'InternalCreated', type: 'int32', index: false, optional: true },
                    { name: 'MsgID', type: 'string', index: false, optional: true },
                    { name: 'CaseRespondant', type: 'string', index: false, optional: true },
                    { name: 'AssignedTo', type: 'string', index: false, optional: true },
                    // { name: 'HasAttachment', type: 'int32', index: false, optional: true },
                    { name: 'InternalType', type: 'string', index: true, optional: true },
                    { name: 'Language', type: 'string', index: false, optional: true },
                    { name: 'GroupId', type: 'int32', sort: false, index: true, optional: true },
                    { name: 'Group', type: 'string', index: false, optional: true },
                    { name: 'Result', type: 'string', index: false, optional: true },
                    { name: 'Tries', type: 'int32', index: false, optional: true },
                    { name: 'TextField', type: 'string', index: false, optional: true },
                    { name: 'Text', type: 'string', index: false, optional: true },
                    { name: 'UnreadMessages', type: 'int32', index: false, optional: true },
                    { name: 'DateTime', type: 'string', index: false, optional: true },
                    { name: 'Created', type: 'string', sort: true, index: true, optional: true },
                    { name: 'LastIncoming', type: 'string', index: false, optional: true },
                    { name: 'LastUpdated', type: 'string', index: false, optional: true },
                    { name: 'LastUpdatedBy', type: 'int32', index: false, optional: true },
                    { name: 'Resolved', type: 'int32', sort: true, index: false, optional: true },
                    { name: 'ResolvedAt', type: 'string', sort: true, index: false, optional: true },
                    { name: 'ResolvedBy', type: 'string', sort: true, index: false, optional: true },
                    
                    { name: 'AgentCommentTime', type: 'string', sort: true, index: false, optional: true },
                    { name: 'InternalSubType', type: 'string', sort: true, index: false, optional: true },
                    { name: 'Outgoing', type: 'bool', sort: true, index: true, optional: false },
                    { name: 'IsChild', type: 'int32', sort: true, index: false, optional: true },
                    { name: 'ToEmail', type: 'string', sort: true, index: false, optional: true },
                    { name: 'CopyToEmail', type: 'string', sort: true, index: false, optional: true },
                    { name: 'HiddenCopyToEmail', type: 'string', sort: true, index: false, optional: true },
                    { name: 'Signature', type: 'string', sort: true, index: false, optional: true },
                    { name: 'AgentName', type: 'string', sort: true, index: true, optional: true },
                    { name: 'GroupName', type: 'string', sort: true, index: false, optional: true },
                    { name: 'ClientName', type: 'string', sort: true, index: true, optional: true },
                    { name: 'ClientPersonNr', type: 'string', sort: true, index: false, optional: true },
                    { name: 'ContactName', type: 'string', sort: true, index: false, optional: true },
                    { name: 'BrandId', type: 'int32', sort: false, index: false, optional: true },
                    { name: 'QueueIcon', type: 'string', sort: false, index: false, optional: true },
                    { name: 'CategoryCount', type: 'int32', sort: false, index: false, optional: true },
                    { name: 'SubCasesCount', type: 'int32', sort: false, index: false, optional: true },
                    { name: 'GroupIdNotNull', type: 'int32', sort: false, index: false, optional: true }
                  ]

Harpreet Sangar

03/19/2025, 10:06 AM

There's no option to make it quicker with wildcard search (q: *). When you're querying a field however, you can mention

enable_lazy_filtering: true

when your

filter_by

matches too many ids in comparison to the query.

Harpreet Sangar

03/19/2025, 10:06 AM

> There's no option to make it quicker with wildcard search (q: *). (apart from changing

:=

as Jason mentioned)

Harpreet Sangar

03/19/2025, 10:08 AM

What is the value of

search_time_ms

in the response?

Taieb

03/19/2025, 10:08 AM

Will check 1 sec

Taieb

03/19/2025, 10:09 AM

Copy code

search_time_ms: 114

Taieb

03/19/2025, 10:09 AM

Mariadb has around 30ms with the same amount but with complex joins

Taieb

03/19/2025, 10:12 AM

There's no option to make it quicker with wildcard search (q: *).

Any other way to solve it?

Taieb

03/19/2025, 10:13 AM

Copy code

result {
  facet_counts: [],
  found: 747037,
  hits: [
    { document: [Object], highlight: {}, highlights: [] },
    { document: [Object], highlight: {}, highlights: [] },
    { document: [Object], highlight: {}, highlights: [] },
    { document: [Object], highlight: {}, highlights: [] },
    { document: [Object], highlight: {}, highlights: [] }
  ],
  out_of: 2510122,
  page: 11,
  request_params: {
    collection_name: 'casemanagement_case_summary_view',
    first_q: '*',
    per_page: 5,
    q: '*'
  },
  search_cutoff: false,
  search_time_ms: 133
}

Harpreet Sangar

03/19/2025, 10:15 AM

I'll need your dataset with just

InternalType

GroupId

Status

, and

Created

fields to see if there's room for improvement somewhere.

Taieb

03/19/2025, 10:17 AM

Copy code

# InternalType, GroupId, Status, Created
'chat', '135', 'Closed', '2025-02-15 17:42:48'
'email', '162', 'Closed', '2025-02-16 08:50:09'
'chat', '135', 'Closed', '2025-02-16 11:50:03'
'email', '135', 'Closed', '2025-02-17 15:42:02'
'email', '135', 'Open', '2025-02-17 15:45:01'
'email', '135', 'Closed', '2025-02-17 15:49:02'
'chat', '383', 'Closed', '2025-02-21 13:25:21'
'chat', '135', 'Closed', '2025-02-26 10:45:58'
'chat', '135', 'Closed', '2025-02-26 14:40:49'
'chat', '135', 'Closed', '2025-02-26 15:07:36'

Taieb

03/19/2025, 10:17 AM

GroupID is int32

Taieb

03/19/2025, 10:17 AM

The full dataset or is this snippet enough?

Harpreet Sangar

03/19/2025, 10:18 AM

The full dataset. As a jsonl file, if possible.

Taieb

03/19/2025, 10:18 AM

hmm

Taieb

03/19/2025, 10:18 AM

it's quite big, but i will try

✅ 1

Taieb

03/19/2025, 10:20 AM

My workbench crashes when i try to dump the data

Harpreet Sangar

03/19/2025, 10:20 AM

If you've indexed the data in Typesense, you can try the export endpoint.

Taieb

03/19/2025, 10:23 AM

ok trying that now

Harpreet Sangar

03/19/2025, 10:23 AM

What Typesense version are you on?

Taieb

03/19/2025, 10:24 AM

Copy code

typesense/typesense:28.0

✅ 1

Taieb

03/19/2025, 10:28 AM

Fyi, the export is running

Taieb

03/19/2025, 10:36 AM

message has been deleted

Taieb

03/19/2025, 10:36 AM

Is that ok?

Harpreet Sangar

03/19/2025, 10:49 AM

Yes

✅ 1

Taieb

03/19/2025, 1:11 PM

Any ideas? 🙂

Harpreet Sangar

03/19/2025, 1:17 PM

I'm working on a bug right now. I'll let you know when I'm available to look into this.

Taieb

03/19/2025, 1:18 PM

ok thanks

Taieb

03/20/2025, 12:22 PM

@Harpreet Sangar -> How's it going with the bug? 🙂

Harpreet Sangar

03/21/2025, 5:08 AM

Figured out the issue, just working on the best possible solution.

Taieb

03/21/2025, 7:42 AM

really? Cool!

Taieb

03/24/2025, 8:36 AM

@Harpreet Sangar i abandoned typesense for meilisearch you don't have to look at it, thanks anyways 🙂

Vincent Giersch

05/18/2025, 2:30 PM

I think I’ve got the same performance issue, as soon as I add an inner join like the one above (1-N relation), the query time goes from a few ms to at least 1 second, and adding another 1-1 join below that inner join makes the query ~3s. I’m running tests on an empty bare metal server with 128GB memory and data is on an NVMe, so plenty of resources for these tests - running typesense v28 or v29-rc23. ✅ The 1-1 relation that run in a few ms is between 2 collections of 18M and 6M documents so that’s nice. ❌ The 1-3 seconds 1-N query is between 18M and 2M documents. It’s actually much slower than running the join manually with multiple queries on mongodb on a small VM (total 100ms) I can write more details in a new thread after running more detailed tests. https://typesense-community.slack.com/archives/C01P749MET0/p1742327251756489?thread_ts=1742320842.862919&channel=C01P749MET0&message_ts=1742327251.756489

Vincent Giersch

05/18/2025, 2:51 PM

I meant a left join* (https://typesense.org/docs/27.1/api/joins.html#left-join)

Harpreet Sangar

05/19/2025, 7:24 AM

@Vincent Giersch Can you share the schema of your collections?

Vincent Giersch

05/19/2025, 11:34 AM

sure, I will try to make a minimal repro

🙌🏼 1

3 Views

Open in Slack

Previous Next