Joel Ödlund
11/06/2024, 6:43 PM
enable_typos_for_alphanumeric_tokens parameter.
I search for 'Something (e290)' and get a hit on 'Propane refrigerant (R290)', which I want to avoid.
For alphanumeric identifiers such as R290 I want to disable typo correction, but I cannot get the setting to work.
The only way I can stop this query from matching is to raise the min_len_1typo parameter.
Some debug info:
I have hybrid search configured, but I turned it off via parameters for this example.
Version: 27.1
options
{'q': 'Something (e290)', 'query_by': 'display_name,embedding_openai_3', 'prefix': 'false,false', 'filter_by': 'version_id:=24.0.0 && path_normid:=01010000000000 && hidden:=false', 'include_fields': 'normid,display_name', 'vector_query': 'embedding_openai_3:([], distance_threshold:0.0, alpha:0.0)', 'num_typos': 1, 'enable_typos_for_numerical_tokens': 'false', 'enable_typos_for_alpha_numerical_tokens': 'false', 'min_len_1typo': 2, 'min_len_2typo': 7, 'limit': 5, 'prioritize_token_position': 'true'}
response
'{"facet_counts": [], "found": 1, "hits": [{"document": {"display_name": "Propane refrigerant (R290)", "normid": "01010305000000"}, "highlight": {"display_name": {"matched_tokens": ["R290"], "snippet": "Propane refrigerant (<mark>R290</mark>)"}}, "highlights": [{"field": "display_name", "matched_tokens": ["R290"], "snippet": "Propane refrigerant (<mark>R290</mark>)"}], "hybrid_search_info": {"rank_fusion_score": 1.0}, "text_match": 578730054646227065, "text_match_info": {"best_field_score": "1108057784572", "best_field_weight": 15, "fields_matched": 1, "num_tokens_dropped": 1, "score": "578730054646227065", "tokens_matched": 1, "typo_prefix_score": 2}}], "out_of": 38225, "page": 1, "request_params": {"collection_name": "taxonomy:2024-11-06_09-55-35", "first_q": "", "per_page": 5, "q": "Something (e290)"}, "search_cutoff": false, "search_time_ms": 841}'
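A minimal Python sketch for reading the response above, in case it helps with debugging. It pulls the matched tokens and the token counts out of each hit. `summarize_hits` is a hypothetical helper written for this example, not part of any Typesense client, and `raw` is an abbreviated copy of the response that keeps only the fields being inspected.

```python
import json

# Abbreviated copy of the response above; only the inspected fields are kept.
raw = (
    '{"found": 1, "hits": [{"document": {"display_name": "Propane refrigerant (R290)"}, '
    '"highlight": {"display_name": {"matched_tokens": ["R290"]}}, '
    '"text_match_info": {"num_tokens_dropped": 1, "tokens_matched": 1, "typo_prefix_score": 2}}]}'
)


def summarize_hits(response_text, field):
    """Return (field value, matched tokens, tokens_matched, num_tokens_dropped) per hit."""
    response = json.loads(response_text)
    rows = []
    for hit in response["hits"]:
        info = hit["text_match_info"]
        rows.append(
            (
                hit["document"][field],
                hit["highlight"][field]["matched_tokens"],
                info["tokens_matched"],
                info["num_tokens_dropped"],
            )
        )
    return rows


summary = summarize_hits(raw, "display_name")
for row in summary:
    print(row)
```

For this response the only matched token is R290, with one token matched and one dropped, which is consistent with the hit coming from the e290 part of the query alone.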
Jason Bosco
11/06/2024, 9:26 PM
Joel Ödlund
11/07/2024, 10:16 AM
### Run Typesense via Docker ########################################
export TYPESENSE_API_KEY=xyz
mkdir "$(pwd)"/typesense-data
docker run -p 8108:8108 \
-v"$(pwd)"/typesense-data:/data typesense/typesense:27.1 \
--data-dir /data \
--api-key=$TYPESENSE_API_KEY \
--enable-cors
### Reproduction Steps ###############################################
export TYPESENSE_API_KEY=xyz
curl "http://localhost:8108/debug" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"
curl "http://localhost:8108/collections" \
-X POST \
-H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-d '{
"name": "companies",
"fields": [
{"name": "company_name", "type": "string" },
{"name": "num_employees", "type": "int32" },
{"name": "country", "type": "string", "facet": true }
],
"default_sorting_field": "num_employees"
}'
curl "http://localhost:8108/collections/companies/documents/import?action=create" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-H "Content-Type: text/plain" \
-X POST \
-d '{"id": "124","company_name": "Stark Industries (a123)","num_employees": 5215,"country": "USA"}
{"id": "125","company_name": "Acme Corp","num_employees": 2133,"country": "CA"}'
curl "http://localhost:8108/multi_search" \
-X POST \
-H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-d '{
"searches": [
{
"collection": "companies",
"q": "something b123",
"query_by": "company_name",
"enable_typos_for_alphanumeric_tokens": false
}
]
}'
### Documentation ######################################################################################
# Visit the API reference section: https://typesense.org/docs/27.1/api/collections.html
# Click on the "Shell" tab under each API resource's docs, to get shell commands for other API endpoints
Joel Ödlund
11/07/2024, 11:08 AM
Jason Bosco
11/07/2024, 5:41 PM
> I do not see this info
May I know which "info" you're referring to?
Joel Ödlund
11/08/2024, 9:29 AM
Joel Ödlund
11/08/2024, 9:55 AM
Joel Ödlund
11/08/2024, 9:55 AM
{"results":[{"facet_counts":[],"found":1,"hits":[{"document":{"company_name":"Stark Industries (a123)","country":"USA","id":"124","num_employees":5215},"highlight":{"company_name":{"matched_tokens":["a123"],"snippet":"Stark Industries (<mark>a123</mark>)"}},"highlights":[{"field":"company_name","matched_tokens":["a123"],"snippet":"Stark Industries (<mark>a123</mark>)"}],"text_match":578730054645710969,"text_match_info":{"best_field_score":"1108057784320","best_field_weight":15,"fields_matched":1,"num_tokens_dropped":1,"score":"578730054645710969","tokens_matched":1,"typo_prefix_score":2}}],"out_of":2,"page":1,"request_params":{"collection_name":"companies","first_q":"something b123","per_page":10,"q":"something b123"},"search_cutoff":false,"search_time_ms":1}]}
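To show why these hits look like typo corrections, here is a small Python sketch that computes the edit distance between each query token and the token it matched. It uses plain Levenshtein distance with lowercased inputs. Both choices are assumptions made for illustration and not necessarily how Typesense counts typos internally. `edit_distance` is a hypothetical helper.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(
                min(
                    prev[j] + 1,               # delete ca
                    cur[j - 1] + 1,            # insert cb
                    prev[j - 1] + (ca != cb),  # substitute, or keep if equal
                )
            )
        prev = cur
    return prev[-1]


# (query token, matched token) pairs taken from the two responses in this thread.
pairs = [("b123", "a123"), ("e290", "R290")]
distances = [edit_distance(q.lower(), m.lower()) for q, m in pairs]
for (q, m), d in zip(pairs, distances):
    print(q, m, d)
```

Both pairs differ by a single substitution, so each match is within one typo of the query token, which is the kind of match the setting was meant to prevent.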
Joel Ödlund
11/28/2024, 7:28 AM
Kishore Nallan
11/30/2024, 2:38 AM
Kishore Nallan
12/03/2024, 7:29 AM
enable_typos_for_alpha_numerical_tokens