# community-help
d
Hello everyone! I have an issue I want to share to see what you think.
Performance degradation on array field queries (~40M docs)
We're seeing that in a collection of ~40M documents with this structure:
{
  "enable_nested_fields": true,
  "fields": [
    {
      "name": "extendedNameEsES",
      "type": "string",
      "index": true,
      "store": true
    },
    {
      "name": "productFormatPublicId",
      "type": "string",
      "index": true,
      "store": true
    },
    {
      "name": "productId",
      "type": "string",
      "index": false,
      "store": true,
      "optional": true
    },
    {
      "name": "handlingUnitId",
      "type": "string",
      "index": false,
      "store": true,
      "optional": true
    },
    {
      "name": "sitePublicId",
      "type": "string",
      "index": true,
      "store": true
    },
    {
      "name": "siteId",
      "type": "string",
      "index": false,
      "store": true,
      "optional": true
    },
    {
      "name": "assortmentStatuses.statusId",
      "type": "string[]",
      "index": true,
      "store": true,
      "optional": true
    },
    {
      "name": "assortmentStatuses.startDate",
      "type": "int64[]",
      "index": true,
      "store": true,
      "optional": true
    },
    {
      "name": "assortmentStatuses.endDate",
      "type": "int64[]",
      "index": true,
      "store": true,
      "optional": true
    }
  ],
  "name": "assortments",
  "num_documents": 36259531
}
When we search or filter by the array field:
curl --location 'https://localhost:12000/collections/assortments/documents/search?q=lech&query_by=extendedNameEsES&filter_by=sitePublicId:=4420&&currentStatusId:=A'
As the number of documents increases, response times degrade significantly. Right now, we are seeing ~1 second response time with the attached query, but as we add more filters, the latency increases.
Questions:
• What workarounds would you recommend in these cases?
• We are considering segmenting this collection into ~1600 collections (one per store) with fewer fields. Do you see this as a good approach?
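One thing that may be worth ruling out in the curl call above: the `&&` inside `filter_by` is not URL-encoded, so a server could read it as a parameter separator and only apply the first condition. The following is a minimal sketch, not a definitive client, of building the same GET URL with encoded values using Python's standard library. The helper names `build_search_url` and `decode_params` are hypothetical; the host, port and parameters are copied from the message.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host and port copied from the curl call above; adjust for your cluster.
BASE_URL = "https://localhost:12000"


def build_search_url(collection: str, params: dict) -> str:
    """Build a GET search URL with every parameter value percent-encoded."""
    query = urlencode(params)  # encodes '&', ':' and '=' inside the values
    return f"{BASE_URL}/collections/{collection}/documents/search?{query}"


def decode_params(url: str) -> dict:
    """Decode a URL's query string, to see what a server would receive."""
    return {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}


params = {
    "q": "lech",
    "query_by": "extendedNameEsES",
    "filter_by": "sitePublicId:=4420&&currentStatusId:=A",
}
url = build_search_url("assortments", params)
print(url)
print(decode_params(url)["filter_by"])
```

Decoding the generated URL gives back the full `filter_by` expression, whereas decoding the unencoded form splits it at the `&&`.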
f
Are you using Typesense directly? If so, could you share the `search_time_ms` in the response body?
h
Also, you can check if sending `enable_lazy_filter: true` reduces the response time.
I have tested the `enable_lazy_filter: true` parameter, but we have not noticed any improvement in response times.
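For anyone repeating this test, here is a small sketch of how the flag could be added to the search parameters and how `search_time_ms` could be read from the response body for a before/after comparison. The helper names are hypothetical, the base parameters are taken from the earlier curl call, and the flag is written as the string `"true"` on the assumption that it is sent in a GET query string.

```python
import json


def with_lazy_filter(params: dict, enabled: bool = True) -> dict:
    """Return a copy of the search parameters with enable_lazy_filter set.

    The value is a string because these parameters go in a query string.
    """
    updated = dict(params)
    updated["enable_lazy_filter"] = "true" if enabled else "false"
    return updated


def read_search_time_ms(response_body: str) -> int:
    """Extract search_time_ms from a raw JSON search response body."""
    return json.loads(response_body)["search_time_ms"]


# Base parameters copied from the curl call earlier in the thread.
base = {
    "q": "lech",
    "query_by": "extendedNameEsES",
    "filter_by": "sitePublicId:=4420&&currentStatusId:=A",
}
lazy = with_lazy_filter(base)
print(lazy)
```

Running the same search with `base` and with `lazy`, then comparing the two `read_search_time_ms` values, shows whether the flag makes any difference for a given query.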
h
@Diego Chacón Sanchiz `enable_lazy_filter: true` won't have any effect in the case of a join. Can you share the relevant field definitions of the `products` and `assortments` collections?
d
assortments collection:
{
  "name": "assortments",
  "fields": [
    { "name": "productFormatPublicId", "type": "string", "facet": false, "optional": false, "index": true, "sort": false, "infix": false, "locale": "", "async_reference": true, "reference": "products.productFormatPublicId", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "productId", "type": "string", "facet": false, "optional": true, "index": false, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "handlingUnitId", "type": "string", "facet": false, "optional": true, "index": false, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "sitePublicId", "type": "string", "facet": false, "optional": false, "index": true, "sort": true, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "siteId", "type": "string", "facet": false, "optional": true, "index": false, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "assortmentStatuses", "type": "object[]", "facet": false, "optional": true, "index": false, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "currentStatusId", "type": "string", "facet": false, "optional": true, "index": true, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "currentStatusStartDate", "type": "int64", "facet": false, "optional": true, "index": true, "sort": true, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "currentStatusEndDate", "type": "int64", "facet": false, "optional": true, "index": true, "sort": true, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "futureStatusId", "type": "string", "facet": false, "optional": true, "index": true, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "futureStatusStartDate", "type": "int64", "facet": false, "optional": true, "index": true, "sort": true, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "futureStatusEndDate", "type": "int64", "facet": false, "optional": true, "index": true, "sort": true, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true }
  ],
  "default_sorting_field": "sitePublicId",
  "enable_nested_fields": true,
  "symbols_to_index": [],
  "token_separators": []
}
products collection:
{
  "name": "products",
  "fields": [
    { "name": "productFormatPublicId", "type": "string", "facet": false, "optional": false, "index": true, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "numericProductFormatPublicId", "type": "int32", "facet": false, "optional": false, "index": true, "sort": true, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "productId", "type": "string", "facet": false, "optional": false, "index": true, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "extendedNameEuES", "type": "string", "facet": false, "optional": true, "index": true, "sort": false, "infix": false, "locale": "eu", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "extendedNameGlES", "type": "string", "facet": false, "optional": true, "index": true, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "extendedNameCaES", "type": "string", "facet": false, "optional": true, "index": true, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "extendedNamePtPT", "type": "string", "facet": false, "optional": true, "index": true, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true },
    { "name": "extendedNameEsES", "type": "string", "facet": false, "optional": false, "index": true, "sort": false, "infix": false, "locale": "", "stem": false, "stem_dictionary": "", "store": true }
  ],
  "default_sorting_field": "numericProductFormatPublicId",
  "enable_nested_fields": true
}
The searches:
{ "q": "leche desnatada", "query_by": "extendedNameEsES", "filter_by": "$assortments(sitePublicId=[7088]&&currentStatusId=N)" }
"search_time_ms": 1067
{ "q": "leche desnatada", "query_by": "extendedNameEsES", "filter_by": "$assortments(sitePublicId:=[7088])" }
"search_time_ms": 6
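To reproduce the two timings side by side (1067 ms with the status condition, 6 ms without), both searches could be sent in one request body. The sketch below only builds that body and makes a few assumptions: the searches run against `products` (implied by the `$assortments(...)` join filter), both conditions use the `:=` operator as in the second query, and the body is meant for Typesense's `multi_search` endpoint. The helper names `join_filter` and `multi_search_body` are hypothetical.

```python
import json


def join_filter(collection: str, *conditions: str) -> str:
    """Wrap conditions in the $collection(...) join filter form used above."""
    return f"${collection}({'&&'.join(conditions)})"


def multi_search_body(searches: list) -> str:
    """Serialize several searches into one multi_search request body."""
    return json.dumps({"searches": searches}, ensure_ascii=False)


# Assumption: the searches target products, joined to assortments.
common = {
    "collection": "products",
    "q": "leche desnatada",
    "query_by": "extendedNameEsES",
}
site_only = {
    **common,
    "filter_by": join_filter("assortments", "sitePublicId:=[7088]"),
}
site_and_status = {
    **common,
    "filter_by": join_filter(
        "assortments", "sitePublicId:=[7088]", "currentStatusId:=N"
    ),
}

body = multi_search_body([site_only, site_and_status])
print(body)
```

Because the body is JSON rather than a query string, the `&&` in the filter needs no URL encoding here.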
h
Hi @Diego Chacón Sanchiz, I am working on reducing the time taken by joins in https://github.com/typesense/typesense/pull/2617 and will let you know when this feature is ready.