Typesense Cloud Search Issue for Large Collections
TLDR: Anh-Jo encountered search issues in a large collection. Jason identified that the max_candidates parameter was causing the problem and mentioned that the update in 0.24.1 would help.


May 15, 2023
Anh-Jo
03:49 PM I have a collection called user with some classic fields like first name, last name, email, etc. (I also set up a dedicated field in this collection, called db_id and equal to the id, so that I can search directly by id.) Sometimes, search doesn't show the expected result, which is the exact match.
Does anyone have an idea?
My query runs across multiple fields (e.g. first_name, last_name, email, username); the only way to get my exact match is to remove the db_id field from my search.
Anh-Jo
03:54 PM
export TYPESENSE_API_KEY=xyz
curl "" \
-X POST \
-H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-d '{
"name": "user",
"fields": [
{"name": "username", "type": "string" },
{"name": "db_id", "type": "string" }
],
"default_sorting_field": ""
}'
curl "" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-H "Content-Type: text/plain" \
-X POST \
-d '{"id": "1","username": "UserTest","db_id": "1"}
{"id": "2","username": "UserTest1","db_id": "2"}'
curl "" \
-X POST \
-H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-d '{
"searches": [
{
"collection": "user",
"q": "UserTest",
"query_by": "username,db_id"
}
]
}'
but with around 490k users, around 121 of which have names close to my expected result
Anh-Jo
03:55 PM I expect UserTest, but it doesn't appear; only UserTest1 is returned.
Jason
03:58 PM
➜ ~ curl "" \
-X POST \
-H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-d '{
"name": "user",
"fields": [
{"name": "username", "type": "string" },
{"name": "db_id", "type": "string" }
],
"default_sorting_field": ""
}'
{"created_at":1684166226,"default_sorting_field":"","enable_nested_fields":false,"fields":[{"facet":false,"index":true,"infix":false,"locale":"","name":"username","optional":false,"sort":false,"type":"string"},{"facet":false,"index":true,"infix":false,"locale":"","name":"db_id","optional":false,"sort":false,"type":"string"}],"name":"user","num_documents":0,"symbols_to_index":[],"token_separators":[]}%
➜ ~ curl "" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-H "Content-Type: text/plain" \
-X POST \
-d '{"id": "1","username": "UserTest","db_id": "1"}
{"id": "2","username": "UserTest1","db_id": "2"}'
{"success":true}
{"success":true}%
➜ ~ curl "" \
-X POST \
-H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
-d '{
"searches": [
{
"collection": "user",
"q": "UserTest",
"query_by": "username,db_id"
}
]
}' | jq
{
"results": [
{
"facet_counts": [],
"found": 2,
"hits": [
{
"document": {
"db_id": "1",
"id": "1",
"username": "UserTest"
},
"highlight": {
"username": {
"matched_tokens": [
"UserTest"
],
"snippet": "<mark>UserTest</mark>"
}
},
"highlights": [
{
"field": "username",
"matched_tokens": [
"UserTest"
],
"snippet": "<mark>UserTest</mark>"
}
],
"text_match": 578730123365712000,
"text_match_info": {
"best_field_score": "1108091339008",
"best_field_weight": 15,
"fields_matched": 1,
"score": "578730123365711993",
"tokens_matched": 1
}
},
{
"document": {
"db_id": "2",
"id": "2",
"username": "UserTest1"
},
"highlight": {
"username": {
"matched_tokens": [
"UserTest"
],
"snippet": "<mark>UserTest</mark>1"
}
},
"highlights": [
{
"field": "username",
"matched_tokens": [
"UserTest"
],
"snippet": "<mark>UserTest</mark>1"
}
],
"text_match": 578730089005449300,
"text_match_info": {
"best_field_score": "1108074561536",
"best_field_weight": 15,
"fields_matched": 1,
"score": "578730089005449337",
"tokens_matched": 1
}
}
],
"out_of": 2,
"page": 1,
"request_params": {
"collection_name": "user",
"per_page": 10,
"q": "UserTest"
},
"search_cutoff": false,
"search_time_ms": 2
}
]
}
Jason
05:11 PM In 0.23.1, Typesense will take the top 4 prefixes and search based on those for performance reasons. This is controlled by the max_candidates parameter; increasing it returns this record.
Jason
05:11 PM In 0.24.1, we’ve increased max_candidates to 1000 for collections with fewer than 500K documents, which will also help here.
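
A minimal sketch of the workaround Jason describes, assuming the search is sent as a multi_search-style request like the one earlier in the thread: the payload simply adds max_candidates to the search. TYPESENSE_HOST and the build_search_payload helper are hypothetical names introduced here for illustration, the /multi_search path is inferred from the shape of the payload, and 1000 is just the value mentioned for 0.24.1, not a tuned recommendation.

```shell
#!/bin/sh
# Build a search payload that raises max_candidates for one query.
# $1 = query text, $2 = max_candidates value.
# Note: values are interpolated as-is, so $1 must not contain double quotes.
build_search_payload() {
  cat <<EOF
{
  "searches": [
    {
      "collection": "user",
      "q": "$1",
      "query_by": "username,db_id",
      "max_candidates": $2
    }
  ]
}
EOF
}

# TYPESENSE_HOST is a placeholder (assumption). The request is only sent
# when it is set; otherwise the payload is printed for inspection.
if [ -n "${TYPESENSE_HOST:-}" ]; then
  build_search_payload "UserTest" 1000 |
    curl "${TYPESENSE_HOST}/multi_search" \
      -X POST \
      -H "Content-Type: application/json" \
      -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
      -d @-
else
  build_search_payload "UserTest" 1000
fi
```

Raising max_candidates widens the set of prefix candidates considered for the query, which is why the exact match can reappear; the trade-off Jason mentions is performance, so it may be worth setting it per query rather than globally.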