Improving Typesense Query Performance
TLDR Jonathan asked about slower-than-expected Typesense query performance. Jason and Kishore Nallan offered suggestions and explanations. After a series of tests, Jonathan found that other queries returned results quickly, indicating the slowness was specific to the original query.
Nov 08, 2022 (11 months ago)
Jonathan
07:05 PM
i've indexed 50 million items as a test, each one looks something like:
{"dbid": 1337, "address": "64a43130af34f9150030f2a2509a9efbd07fe372"}
querying for "000000" returns 4 items in ~200ms (12 cores, 128gb ram, 4x2TB RAID 0)
200ms is pretty decent, but not "amazing". an in-memory ART (adaptive radix trie, which i believe typesense also uses) can return this in a few ms. does 200ms seem in-line with your expectations?
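For a rough sense of what an in-memory lookup costs on data of this shape, here is a minimal baseline sketch. It is not an ART and not how Typesense works internally: it swaps in a sorted list with binary search, and it assumes that "querying" means prefix matching on lowercase hex strings. The SHA-1 derivation of `address` from `dbid` is a hypothetical stand-in for the real data.

```python
import bisect
import hashlib


def make_address(dbid: int) -> str:
    """Return a 40-character lowercase hex string (SHA-1 of dbid is an assumption)."""
    return hashlib.sha1(str(dbid).encode()).hexdigest()


def build_index(n: int) -> list[str]:
    """Generate n synthetic addresses and sort them so they can be binary searched."""
    return sorted(make_address(i) for i in range(n))


def prefix_search(index: list[str], prefix: str) -> list[str]:
    """Return every address in the sorted index that starts with prefix.

    Only valid for lowercase hex strings: "g" sorts after every hex digit,
    so prefix + "g" is an exclusive upper bound for all matches.
    """
    lo = bisect.bisect_left(index, prefix)
    hi = bisect.bisect_left(index, prefix + "g")
    return index[lo:hi]


if __name__ == "__main__":
    index = build_index(100_000)
    print(len(prefix_search(index, "000")))
```

A sorted list gives O(log n) lookups but none of the typo tolerance, ranking, or highlighting a search engine provides, so the comparison is only a lower bound on latency.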
Jason
07:07 PM
Jason
07:07 PM
Jonathan
07:08 PM
curl ""
SCHEMA:
curl "" \
-X POST \
-H "Content-Type: application/json" -d '{
"name": "addresses",
"fields": [
{"name": "dbid", "type": "int64" },
{"name": "address", "type": "string" }
],
"default_sorting_field": "dbid"
}'
Jonathan
07:11 PM
0.24.0.rcn28
, i meant to try outside of docker but haven't yet)
Jason
07:12 PM
num_typos=0 & typo_tokens_threshold=0 & drop_tokens_threshold=0 & prioritize_exact_match=false & highlight_fields=none
(space added for readability) and see if that makes a difference performance-wise
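The parameters Jason lists can be assembled into a search URL as sketched below. The parameter names come from his message; the base URL `http://localhost:8108`, the `query_by` field, and the helper name `build_search_url` are assumptions for illustration. The collection name `addresses` is taken from the schema above. The request itself (which would also need an `X-TYPESENSE-API-KEY` header) is not shown.

```python
from urllib.parse import urlencode

# Parameters suggested in the thread to switch off typo tolerance,
# token dropping, exact-match prioritisation, and highlighting.
TUNING_PARAMS = {
    "num_typos": 0,
    "typo_tokens_threshold": 0,
    "drop_tokens_threshold": 0,
    "prioritize_exact_match": "false",
    "highlight_fields": "none",
}


def build_search_url(
    query: str,
    base_url: str = "http://localhost:8108",  # assumed local server address
    collection: str = "addresses",
    query_by: str = "address",  # assumed: search the string field from the schema
) -> str:
    """Build a documents/search URL that includes the tuning parameters."""
    params = {"q": query, "query_by": query_by, **TUNING_PARAMS}
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("000000"))
```

Comparing response times with and without `TUNING_PARAMS` should show whether these features account for the latency.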
Jason
07:13 PM
Jason
07:13 PM
Jonathan
07:15 PM
Jonathan
07:52 PM
typesense-server-0.23.1-linux-amd64.tar.gz
that's with a fresh data directory, new index, and restart after creating index. surprising result
Jason
07:53 PM
Jason
07:54 PM
Jason
07:55 PM
GET /metrics.json ?
Jonathan
07:56 PM
{
"system_cpu10_active_percentage": "0.00",
"system_cpu11_active_percentage": "9.09",
"system_cpu12_active_percentage": "0.00",
"system_cpu13_active_percentage": "9.09",
"system_cpu14_active_percentage": "0.00",
"system_cpu15_active_percentage": "9.09",
"system_cpu16_active_percentage": "0.00",
"system_cpu17_active_percentage": "10.00",
"system_cpu18_active_percentage": "0.00",
"system_cpu19_active_percentage": "9.09",
"system_cpu1_active_percentage": "27.27",
"system_cpu20_active_percentage": "0.00",
"system_cpu21_active_percentage": "0.00",
"system_cpu22_active_percentage": "0.00",
"system_cpu23_active_percentage": "0.00",
"system_cpu24_active_percentage": "0.00",
"system_cpu2_active_percentage": "25.00",
"system_cpu3_active_percentage": "10.00",
"system_cpu4_active_percentage": "10.00",
"system_cpu5_active_percentage": "0.00",
"system_cpu6_active_percentage": "9.09",
"system_cpu7_active_percentage": "9.09",
"system_cpu8_active_percentage": "9.09",
"system_cpu9_active_percentage": "0.00",
"system_cpu_active_percentage": "6.10",
"system_disk_total_bytes": "7610737090560",
"system_disk_used_bytes": "3837115981824",
"system_memory_total_bytes": "134997864448",
"system_memory_used_bytes": "71718522880",
"system_network_received_bytes": "0",
"system_network_sent_bytes": "0",
"typesense_memory_active_bytes": "11111964672",
"typesense_memory_allocated_bytes": "11072338904",
"typesense_memory_fragmentation_ratio": "0.00",
"typesense_memory_mapped_bytes": "11397263360",
"typesense_memory_metadata_bytes": "226870128",
"typesense_memory_resident_bytes": "11111964672",
"typesense_memory_retained_bytes": "1533775872"
}
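Every value in the metrics output above is a string, so it has to be converted before comparing numbers. The sketch below pulls out a few figures that bear on the performance question. The helper name `summarize_metrics` and the choice of summary fields are hypothetical; the metric keys are the ones shown in the output.

```python
import json
import re

GIB = 1024 ** 3
# Per-core keys look like "system_cpu7_active_percentage".
PER_CORE_KEY = re.compile(r"^system_cpu\d+_active_percentage$")


def summarize_metrics(raw: str) -> dict:
    """Convert the string-valued metrics JSON into a few numeric indicators."""
    metrics = json.loads(raw)
    core_loads = [float(v) for k, v in metrics.items() if PER_CORE_KEY.match(k)]
    return {
        "typesense_resident_gib": int(metrics["typesense_memory_resident_bytes"]) / GIB,
        "system_memory_used_fraction": int(metrics["system_memory_used_bytes"])
        / int(metrics["system_memory_total_bytes"]),
        "overall_cpu_percent": float(metrics["system_cpu_active_percentage"]),
        "busiest_core_percent": max(core_loads) if core_loads else None,
    }
```

Applied to the numbers above, this gives roughly 10.3 GiB resident for Typesense and about 53% of system memory in use, with no core heavily loaded.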
Jason
07:58 PM
000000 ? Could you try a random set of other strings to see if it’s consistent?
Jonathan
07:58 PM
Jonathan
07:59 PM
Jonathan
07:59 PM
Jonathan
08:01 PM
Jason
08:01 PM
Jonathan
08:01 PM
Jonathan
08:01 PM
Jonathan
08:01 PM
Jonathan
08:01 PM
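Jason's suggestion to try a random set of other strings can be sketched as a small timing harness. The `search` callable is a placeholder for whatever actually sends the query (an HTTP client call, for instance), and all function names here are hypothetical. The measured time is wall-clock and includes network overhead, so it is best read alongside the `search_time_ms` value in each Typesense response.

```python
import random
import statistics
import time
from typing import Callable, Optional

HEX_DIGITS = "0123456789abcdef"


def random_hex_queries(count: int, length: int = 6, seed: Optional[int] = None) -> list[str]:
    """Generate random lowercase hex strings to use as test queries."""
    rng = random.Random(seed)
    return ["".join(rng.choice(HEX_DIGITS) for _ in range(length)) for _ in range(count)]


def time_queries(search: Callable[[str], object], queries: list[str]) -> list[tuple[str, float]]:
    """Run each query through search and record wall-clock time in milliseconds."""
    timings = []
    for query in queries:
        start = time.perf_counter()
        search(query)
        timings.append((query, (time.perf_counter() - start) * 1000.0))
    return timings


def summarize(timings: list[tuple[str, float]]) -> dict:
    """Report the spread of timings so a single slow query stands out."""
    values = [ms for _, ms in timings]
    return {
        "min_ms": min(values),
        "median_ms": statistics.median(values),
        "max_ms": max(values),
    }
```

If the median is low and only one query sits far above it, the slowness is specific to that query rather than to the server or the dataset.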
Nov 09, 2022 (11 months ago)
Kishore Nallan
01:38 AM
Jonathan
01:50 AM
00000 only had 4 matches in 50 million documents so it may not be due to that but i acknowledge your point)
i'm surprised and impressed that raw hex strings worked so well with typesense. most other search libraries couldn't handle it