Hi team, I think we discovered a bug with field le...
# community-help
a
Hi team, I think we discovered a bug with field-level token_separators. Typesense version 29.0. Here’s the field excerpt from the schema:
{
  "facet": true,
  "index": true,
  "infix": false,
  "locale": "",
  "name": "contactPersonEmail",
  "optional": true,
  "sort": false,
  "stem": false,
  "stem_dictionary": "",
  "store": true,
  "token_separators": [
    ".",
    "-",
    "_",
    "@"
  ],
  "type": "string"
}
And the collection-level token separators:
"token_separators": [
    ".",
    "-",
    "_"
]
When executing this query (q=“example”):
GET /collections/<collection>/documents/search?filter_by=&q=example&query_by=contactPersonEmail HTTP/1.1
We notice that the response highlights the wrong token: the top-level domain (org, net, or com) instead of example.
"highlight": {
    "contactPersonEmail": {
        "matched_tokens": [
            "org"
        ],
        "snippet": "269489-keshawn@example.<mark>org</mark>"
    }
}
Is there something we’re doing wrong with either the query or the schema? Or is there a bug in the latest version? Apologies if this was already reported.
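For reference, here is a rough sketch of how the two separator lists would split the sample email. It is plain Python, not Typesense's actual tokenizer, and the helper names `split_tokens` and `highlight` are made up for illustration. It assumes separators simply act as split points, which is enough to show that "example" only becomes a standalone token when "@" is in the list, and which highlight we expected to see.

```python
import re

def split_tokens(text, separators):
    """Split text on whitespace and on any of the given separator characters."""
    pattern = "[" + re.escape("".join(separators)) + r"\s]+"
    return [tok for tok in re.split(pattern, text) if tok]

def highlight(text, query, separators):
    """Wrap every token equal to the query in <mark> tags, keeping separators."""
    pattern = "([" + re.escape("".join(separators)) + r"\s]+)"
    parts = re.split(pattern, text)  # capturing group keeps the separators
    return "".join(
        f"<mark>{p}</mark>" if p.lower() == query.lower() else p for p in parts
    )

# Separator lists copied from the schema excerpts above.
field_separators = [".", "-", "_", "@"]
collection_separators = [".", "-", "_"]
email = "269489-keshawn@example.org"

print(split_tokens(email, field_separators))
print(split_tokens(email, collection_separators))
print(highlight(email, "example", field_separators))
```

With the field-level list the tokens come out as 269489, keshawn, example, org, so a match on "example" should highlight the third token rather than "org".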
f
Are you using Typesense cloud? If so, could you share your cluster ID?
a
Hi Fani! This was tested with the docker image.
f
Could you re-create the behavior using a similar example to this one? https://gist.github.com/jasonbosco/7c3432713216c378472f13e72246f46b
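A self-contained reproduction could look roughly like the sketch below. It is not taken from the linked gist; it only uses the standard collection, document, and search endpoints with the field and separators quoted earlier in the thread. The collection name `contacts_repro`, the base URL `http://localhost:8108`, and the API key `xyz` are placeholders for a local docker instance.

```python
import json
import urllib.parse
import urllib.request

COLLECTION = "contacts_repro"  # hypothetical collection name for the repro

def build_schema():
    """Collection schema with both collection-level and field-level separators."""
    return {
        "name": COLLECTION,
        "token_separators": [".", "-", "_"],
        "fields": [
            {
                "name": "contactPersonEmail",
                "type": "string",
                "facet": True,
                "optional": True,
                "token_separators": [".", "-", "_", "@"],
            }
        ],
    }

def build_search_path(q):
    """Search path equivalent to the query in the original report."""
    params = urllib.parse.urlencode({"q": q, "query_by": "contactPersonEmail"})
    return f"/collections/{COLLECTION}/documents/search?{params}"

def call(base_url, api_key, method, path, body=None):
    """Send one JSON request to the server and return the decoded response."""
    data = json.dumps(body).encode() if body is not None else None
    headers = {"X-TYPESENSE-API-KEY": api_key, "Content-Type": "application/json"}
    req = urllib.request.Request(base_url + path, data=data, method=method, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def run_repro(base_url="http://localhost:8108", api_key="xyz"):
    """Create the collection, index one document, search, print the highlights."""
    call(base_url, api_key, "POST", "/collections", build_schema())
    doc = {"contactPersonEmail": "269489-keshawn@example.org"}
    call(base_url, api_key, "POST", f"/collections/{COLLECTION}/documents", doc)
    result = call(base_url, api_key, "GET", build_search_path("example"))
    for hit in result.get("hits", []):
        print(hit.get("highlight"))
```

Calling `run_repro()` against a running server prints the highlight object for each hit, which can then be compared with the snippet in the report.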
a
Ofc!
f
a
Amazing! Thank you Fani 🙏
In which release should we expect the fix to land? Any idea of an approximate timeline?
f
We've merged it into v30, so the next RC build will be the one that includes it. I'll report back when we have a release.