Hi! I have a collection with `symbols_to_index: []...
# community-help
d
Hi! I have a collection with `symbols_to_index: []`. I tried dropping and recreating a field to use `symbols_to_index: ["+"]`, but when I search the collection with `q="something+"`, the `+` symbol is not included in the match. Then I created another collection with `symbols_to_index: ["+"]`, and in this case the `+` symbol was included in the match. Is it possible to set `symbols_to_index` for a single field, or do I need to create a new collection? I am using version 29.0.
f
This has been fixed on the latest HEAD of Typesense, but a release candidate isn't yet available. We'll let you know once we have one
d
Ok, thank you! Do you have a rough idea of when the release might be available? I’m asking so I can plan whether to create a workaround with an alias or just wait
f
Not too far off, could be as early as Monday
d
Hi, @Fanis Tharropoulos! Is there already a release or release candidate available for that fix?
f
Try out the RC17 candidate!
d
It seems the fix didn’t work. I tried dropping the field and reindexing twice, but the search is still ignoring the “+” character. Here’s how the field is currently configured:
{
...
    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "title",
      "optional": false,
      "sort": true,
      "stem": false,
      "stem_dictionary": "",
      "store": true,
      "symbols_to_index": [
        "+"
      ],
      "type": "string"
    }
  ],
  "name": "items",
  "num_documents": 1883,
  "symbols_to_index": [],
  "token_separators": []
}
But when I searched, the only one of the 7 results containing the “+” character appeared last. Do I need to do something different?
a
Hi Davi, I'm not sure we support changing `symbols_to_index` through a schema alter. You would need to create another collection with it set and then reindex the documents.
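A minimal Python sketch of the schema side of that workaround. It assumes the new collection copies the existing schema and sets `symbols_to_index` at the collection level, which is the configuration reported to work in this thread. The helper `build_schema_with_symbols`, the collection name `items_v2`, and the list of server-reported keys to strip are illustrative assumptions, not part of Typesense.

```python
import copy

# Keys that appear in the schema dumps in this thread but describe server
# state rather than configuration (assumption: they should not be sent back
# when creating a new collection).
SERVER_REPORTED_KEYS = ("created_at", "num_documents")


def build_schema_with_symbols(existing_schema, new_name, symbols):
    """Return a renamed copy of existing_schema with symbols_to_index set
    at the collection level. The input schema is left untouched."""
    schema = copy.deepcopy(existing_schema)
    for key in SERVER_REPORTED_KEYS:
        schema.pop(key, None)
    schema["name"] = new_name
    schema["symbols_to_index"] = list(symbols)
    return schema


# Trimmed-down version of the "items" schema shown earlier in the thread.
old_schema = {
    "name": "items",
    "num_documents": 1883,
    "fields": [{"name": "title", "type": "string", "sort": True}],
    "symbols_to_index": [],
    "token_separators": [],
}

# "items_v2" is a hypothetical name for the replacement collection.
new_schema = build_schema_with_symbols(old_schema, "items_v2", ["+"])
print(new_schema["symbols_to_index"])  # ['+']
```

The resulting dict would then be used to create the new collection before reindexing the documents. An alias pointing at the new collection is one way to switch over without changing the search code.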
d
Hi Allan, I created a new collection and reindexed all the documents, setting `symbols_to_index` at the field level, using the following schema:
{
  "created_at": 1761228770,
  "default_sorting_field": "title",
  "enable_nested_fields": false,
  "fields": [
    {
      "facet": false,
      "index": true,
      "infix": false,
      "locale": "",
      "name": "title",
      "optional": false,
      "sort": true,
      "stem": false,
      "stem_dictionary": "",
      "store": true,
      "symbols_to_index": ["+"],
      "type": "string"
    },
    ...
  ],
  "name": "drug_items",
  "num_documents": 1902,
  "symbols_to_index": [],
  "token_separators": []
}
However, it still doesn’t work. It only works when I set `symbols_to_index` at the collection level. I’m currently using v30.0.rc25.
f
Hey Davi, could you share a few documents that aren't present in the response for the field-level `symbols_to_index`? We'll test it on a local cluster to debug.
d
Sure. I used the Typesense Cloud interface to search for `"fps 50+"`, and I got seven results. The first one is:
{
  "document": {
    "title": "Helioderm Suncare Facial FPS 50"
  },
  "highlight": {
    "title": {
      "matched_tokens": [
        "FPS",
        "50"
      ],
      "snippet": "Helioderm Suncare Facial <mark>FPS</mark> <mark>50</mark>",
      "value": "Helioderm Suncare Facial <mark>FPS</mark> <mark>50</mark>"
    }
  },
  "highlights": [
    {
      "field": "title",
      "matched_tokens": [
        "FPS",
        "50"
      ],
      "snippet": "Helioderm Suncare Facial <mark>FPS</mark> <mark>50</mark>",
      "value": "Helioderm Suncare Facial <mark>FPS</mark> <mark>50</mark>"
    }
  ],
  "text_match": 1157451471441100800,
  "text_match_info": {
    "best_field_score": "2211897868288",
    "best_field_weight": 15,
    "fields_matched": 1,
    "num_tokens_dropped": 0,
    "score": "1157451471441100921",
    "tokens_matched": 2,
    "typo_prefix_score": 0
  }
}
And the last one is:
{
  "document": {
    "title": "Isdin Fluid Tattoo FPS 50+"
  },
  "highlight": {
    "title": {
      "matched_tokens": [
        "FPS",
        "50"
      ],
      "snippet": "Isdin Fluid Tattoo <mark>FPS</mark> <mark>50</mark>+",
      "value": "Isdin Fluid Tattoo <mark>FPS</mark> <mark>50</mark>+"
    }
  },
  "highlights": [
    {
      "field": "title",
      "matched_tokens": [
        "FPS",
        "50"
      ],
      "snippet": "Isdin Fluid Tattoo <mark>FPS</mark> <mark>50</mark>+",
      "value": "Isdin Fluid Tattoo <mark>FPS</mark> <mark>50</mark>+"
    }
  ],
  "text_match": 1157451437081362400,
  "text_match_info": {
    "best_field_score": "2211881091072",
    "best_field_weight": 15,
    "fields_matched": 1,
    "num_tokens_dropped": 0,
    "score": "1157451437081362553",
    "tokens_matched": 2,
    "typo_prefix_score": 1
  }
}
However, when I run the same search with `symbols_to_index` defined at the collection level, I get only one result: the one with the title `Isdin Fluid Tattoo FPS 50+`.
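The difference between the two result sets can be illustrated with a toy model in Python. This is not Typesense's tokenizer: it assumes special characters are dropped unless listed in `symbols_to_index`, and it ignores typo tolerance, prefix matching, and ranking. The functions `tokenize` and `matches` are hypothetical helpers written only for this illustration.

```python
def tokenize(text, symbols_to_index=()):
    """Toy tokenizer: lowercase, split on whitespace, and drop characters
    that are neither alphanumeric nor listed in symbols_to_index."""
    keep = set(symbols_to_index)
    tokens = []
    for word in text.lower().split():
        cleaned = "".join(ch for ch in word if ch.isalnum() or ch in keep)
        if cleaned:
            tokens.append(cleaned)
    return tokens


def matches(query, title, symbols_to_index=()):
    """True if every query token appears as a whole token in the title."""
    title_tokens = set(tokenize(title, symbols_to_index))
    return all(tok in title_tokens for tok in tokenize(query, symbols_to_index))


# The two titles quoted in this thread.
titles = ["Helioderm Suncare Facial FPS 50", "Isdin Fluid Tattoo FPS 50+"]

# "+" not indexed: the query becomes ["fps", "50"], so both titles match.
without_plus = [t for t in titles if matches("fps 50+", t)]

# "+" indexed: the query becomes ["fps", "50+"], so only one title matches.
with_plus = [t for t in titles if matches("fps 50+", t, symbols_to_index=["+"])]

print(without_plus)
print(with_plus)
```

Under this simplified model, the field-level results above look like the first case (the `+` is dropped from both the query and the indexed title), while the collection-level result looks like the second.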