Thomas De Craemer
08/06/2025, 12:28 PM
POST http://localhost:8108/collections
Content-Type: application/json
X-TYPESENSE-API-KEY: xyz
{
  "name": "my-index",
  "num_documents": 0,
  "fields": [
    {"name": "content", "type": "string", "optional": false, "index": true, "infix": true, "stem": true, "locale": "nl"}
  ]
}
###
POST http://localhost:8108/collections/my-index/documents/import?action=create
Content-Type: text/plain
X-TYPESENSE-API-KEY: xyz
{"content": "CAD-fiche"}
{"content": "oranje slachtofferfiches"}
{"content": "oranje tassen"}
{"content": "oranje wagens"}
{"content": "meesterlijke informatiefiches"}
{"content": "fache"}
###
POST http://localhost:8108/multi_search
Content-Type: application/json
X-TYPESENSE-API-KEY: xyz
{
  "searches": [
    {
      "collection": "my-index",
      "q": "oranje fiche",
      "query_by": "content",
      "infix": "always"
    }
  ]
}
I expect "*oranje* slachtoffer*fiche*s" to appear as the best match. However, this specific text only appears as the fourth match with a very low text_match score. The matching only happens on "oranje", not on "fiche".
If I do the same test on Algolia, "oranje slachtofferfiches" appears as the first and only match. It correctly matches on "oranje" and "fiche".
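To make the expectation concrete, here is a toy sketch in Python. It is not how Typesense or Algolia actually score documents; it only counts how many query tokens occur as a substring of each document, which is the behaviour the expectation above assumes. The names `tokens_matched` and `rank` are hypothetical helpers introduced for illustration.

```python
# Toy model only: this is NOT Typesense's or Algolia's real ranking.
DOCS = [
    "CAD-fiche",
    "oranje slachtofferfiches",
    "oranje tassen",
    "oranje wagens",
    "meesterlijke informatiefiches",
    "fache",
]


def tokens_matched(query: str, doc: str) -> int:
    """Count query tokens that occur anywhere inside the document text."""
    text = doc.lower()
    return sum(1 for token in query.lower().split() if token in text)


def rank(query: str, docs: list[str]) -> list[tuple[str, int]]:
    """Return (doc, matched-token count) pairs, best first, dropping non-matches."""
    scored = [(doc, tokens_matched(query, doc)) for doc in docs]
    matched = [pair for pair in scored if pair[1] > 0]
    # sorted() is stable, so ties keep their original document order.
    return sorted(matched, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for doc, count in rank("oranje fiche", DOCS):
        print(count, doc)
```

Under this toy model "oranje slachtofferfiches" is the only document that matches both tokens. Keeping only documents whose count equals the number of query tokens would leave it as the single result, which corresponds to the Algolia behaviour described above.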
Any ideas? Is this a limitation in Typesense? Am I doing something wrong here?
Thanks!
jakubnowakowski0002 tak
08/06/2025, 12:32 PM
Thomas De Craemer
08/06/2025, 12:35 PM
Thomas De Craemer
08/06/2025, 12:40 PM
{
  "results": [
    {
      "facet_counts": [],
      "found": 4,
      "hits": [
        {
          "document": {
            "content": "CAD-fiche",
            "id": "0"
          },
          "highlight": {
            "content": {
              "matched_tokens": [
                "fiche"
              ],
              "snippet": "CAD-<mark>fiche</mark>"
            }
          },
          "highlights": [
            {
              "field": "content",
              "matched_tokens": [
                "fiche"
              ],
              "snippet": "CAD-<mark>fiche</mark>"
            }
          ],
          "text_match": 578730123365187700,
          "text_match_info": {
            "best_field_score": "1108091338752",
            "best_field_weight": 15,
            "fields_matched": 1,
            "num_tokens_dropped": 1,
            "score": "578730123365187705",
            "tokens_matched": 1,
            "typo_prefix_score": 0
          }
        },
        {
          "document": {
            "content": "oranje wagens",
            "id": "3"
          },
          "highlight": {
            "content": {
              "matched_tokens": [
                "oranje"
              ],
              "snippet": "<mark>oranje</mark> wagens"
            }
          },
          "highlights": [
            {
              "field": "content",
              "matched_tokens": [
                "oranje"
              ],
              "snippet": "<mark>oranje</mark> wagens"
            }
          ],
          "text_match": 100,
          "text_match_info": {
            "best_field_score": "0",
            "best_field_weight": 12,
            "fields_matched": 4,
            "num_tokens_dropped": 2,
            "score": "100",
            "tokens_matched": 0,
            "typo_prefix_score": 255
          }
        },
        {
          "document": {
            "content": "oranje tassen",
            "id": "2"
          },
          "highlight": {
            "content": {
              "matched_tokens": [
                "oranje"
              ],
              "snippet": "<mark>oranje</mark> tassen"
            }
          },
          "highlights": [
            {
              "field": "content",
              "matched_tokens": [
                "oranje"
              ],
              "snippet": "<mark>oranje</mark> tassen"
            }
          ],
          "text_match": 100,
          "text_match_info": {
            "best_field_score": "0",
            "best_field_weight": 12,
            "fields_matched": 4,
            "num_tokens_dropped": 2,
            "score": "100",
            "tokens_matched": 0,
            "typo_prefix_score": 255
          }
        },
        {
          "document": {
            "content": "oranje slachtofferfiches",
            "id": "1"
          },
          "highlight": {
            "content": {
              "matched_tokens": [
                "oranje"
              ],
              "snippet": "<mark>oranje</mark> slachtofferfiches"
            }
          },
          "highlights": [
            {
              "field": "content",
              "matched_tokens": [
                "oranje"
              ],
              "snippet": "<mark>oranje</mark> slachtofferfiches"
            }
          ],
          "text_match": 100,
          "text_match_info": {
            "best_field_score": "0",
            "best_field_weight": 12,
            "fields_matched": 4,
            "num_tokens_dropped": 2,
            "score": "100",
            "tokens_matched": 0,
            "typo_prefix_score": 255
          }
        }
      ],
      "out_of": 6,
      "page": 1,
      "request_params": {
        "collection_name": "my-index",
        "first_q": "oranje fiche",
        "per_page": 10,
        "q": "oranje fiche"
      },
      "search_cutoff": false,
      "search_time_ms": 1
    }
  ]
}
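A response like this is easier to compare at a glance when reduced to one row per hit. The sketch below is a minimal Python helper for that; `summarize_hits` is a hypothetical name, and it assumes only the field names visible in the response above (`results`, `hits`, `document.content`, `highlight.content.matched_tokens`, `text_match_info.tokens_matched`). The sample is a trimmed copy of two of the hits shown.

```python
import json


def summarize_hits(response: dict) -> list[tuple[str, list[str], int]]:
    """Return (content, matched_tokens, tokens_matched) for every hit."""
    rows = []
    for result in response.get("results", []):
        for hit in result.get("hits", []):
            content = hit["document"]["content"]
            matched = hit["highlight"]["content"]["matched_tokens"]
            info = hit["text_match_info"]
            rows.append((content, matched, info["tokens_matched"]))
    return rows


# Trimmed copy of two hits from the response above (other fields omitted).
SAMPLE = json.loads("""
{
  "results": [
    {
      "hits": [
        {
          "document": {"content": "CAD-fiche", "id": "0"},
          "highlight": {"content": {"matched_tokens": ["fiche"]}},
          "text_match_info": {"tokens_matched": 1}
        },
        {
          "document": {"content": "oranje slachtofferfiches", "id": "1"},
          "highlight": {"content": {"matched_tokens": ["oranje"]}},
          "text_match_info": {"tokens_matched": 0}
        }
      ]
    }
  ]
}
""")

if __name__ == "__main__":
    for content, matched, count in summarize_hits(SAMPLE):
        print(f"{content!r}: matched={matched}, tokens_matched={count}")
```

Read this way, the rows show that "oranje slachtofferfiches" is only highlighted on "oranje" and never on "fiche".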
Alan Martini
08/06/2025, 1:25 PM
Thomas De Craemer
08/06/2025, 1:27 PM
Alan Martini
08/06/2025, 1:33 PM
Thomas De Craemer
08/06/2025, 1:50 PM
Thomas De Craemer
08/06/2025, 4:25 PM
Alan Martini
08/07/2025, 1:22 PM
Alan Martini
08/07/2025, 2:54 PM
Searching for "oranje fiche" will not highlight "oranje slachtofferfiches", but searching for "fiche oranje" will.
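Since token order changes what gets highlighted, one thing to try is sending every ordering of the query tokens in a single multi_search request. The Python sketch below only builds the request body; `build_multi_search` is a hypothetical helper, and whether the reordered queries give acceptable ranking (and how to merge the result sets client-side) is an open assumption, not something confirmed in this thread.

```python
import json
from itertools import permutations


def build_multi_search(query: str, collection: str, query_by: str) -> dict:
    """Build a multi_search body with one search per ordering of the query tokens."""
    tokens = query.split()
    searches = [
        {
            "collection": collection,
            "q": " ".join(order),
            "query_by": query_by,
            "infix": "always",
        }
        for order in permutations(tokens)
    ]
    return {"searches": searches}


if __name__ == "__main__":
    body = build_multi_search("oranje fiche", "my-index", "content")
    print(json.dumps(body, indent=2))
```

The number of orderings grows factorially with the number of tokens, so this is only practical for short queries of two or three words.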
After discussing this internally with a colleague: this is actually expected. Infix search is an expensive operation and is primarily intended for matching inside identifiers such as ID fields, emails, or usernames.
Thomas De Craemer
08/07/2025, 3:48 PM