Jonathan Otto
11/08/2022, 7:05 PM
{"dbid": 1337, "address": "64a43130af34f9150030f2a2509a9efbd07fe372"}
querying for "000000" returns 4 items in ~200ms (12 cores, 128gb ram, 4x2TB RAID 0)
200ms is pretty decent, but not "amazing". an in-memory ART (adaptive radix tree, which i believe typesense also uses) can return this in a few ms. does 200ms seem in line with your expectations?
Jason Bosco
11/08/2022, 7:07 PM
Jason Bosco
11/08/2022, 7:07 PM
Jonathan Otto
11/08/2022, 7:08 PM
curl "http://localhost:8108/collections/addresses/documents/search?q=000000&query_by=address"
SCHEMA:
curl "http://localhost:8108/collections" \
-X POST \
-H "Content-Type: application/json" \
-d '{
"name": "addresses",
"fields": [
{"name": "dbid", "type": "int64" },
{"name": "address", "type": "string" }
],
"default_sorting_field": "dbid"
}'
Jonathan Otto
11/08/2022, 7:11 PM
0.24.0.rcn28, i meant to try outside of docker but haven't yet)
Jason Bosco
11/08/2022, 7:12 PM
num_typos=0 & typo_tokens_threshold=0 & drop_tokens_threshold=0 & prioritize_exact_match=false & highlight_fields=none
(space added for readability)
and see if that makes a difference performance-wise
Jason Bosco
11/08/2022, 7:13 PM
Jason Bosco
11/08/2022, 7:13 PM
Jonathan Otto
11/08/2022, 7:15 PM
Jonathan Otto
11/08/2022, 7:52 PM
typesense-server-0.23.1-linux-amd64.tar.gz
that's with a fresh data directory, new index, and restart after creating index. surprising result
Jason Bosco
11/08/2022, 7:53 PM
Jason Bosco
11/08/2022, 7:54 PM
Jason Bosco
11/08/2022, 7:55 PM
GET /metrics.json
?
Jonathan Otto
11/08/2022, 7:56 PM
{
"system_cpu10_active_percentage": "0.00",
"system_cpu11_active_percentage": "9.09",
"system_cpu12_active_percentage": "0.00",
"system_cpu13_active_percentage": "9.09",
"system_cpu14_active_percentage": "0.00",
"system_cpu15_active_percentage": "9.09",
"system_cpu16_active_percentage": "0.00",
"system_cpu17_active_percentage": "10.00",
"system_cpu18_active_percentage": "0.00",
"system_cpu19_active_percentage": "9.09",
"system_cpu1_active_percentage": "27.27",
"system_cpu20_active_percentage": "0.00",
"system_cpu21_active_percentage": "0.00",
"system_cpu22_active_percentage": "0.00",
"system_cpu23_active_percentage": "0.00",
"system_cpu24_active_percentage": "0.00",
"system_cpu2_active_percentage": "25.00",
"system_cpu3_active_percentage": "10.00",
"system_cpu4_active_percentage": "10.00",
"system_cpu5_active_percentage": "0.00",
"system_cpu6_active_percentage": "9.09",
"system_cpu7_active_percentage": "9.09",
"system_cpu8_active_percentage": "9.09",
"system_cpu9_active_percentage": "0.00",
"system_cpu_active_percentage": "6.10",
"system_disk_total_bytes": "7610737090560",
"system_disk_used_bytes": "3837115981824",
"system_memory_total_bytes": "134997864448",
"system_memory_used_bytes": "71718522880",
"system_network_received_bytes": "0",
"system_network_sent_bytes": "0",
"typesense_memory_active_bytes": "11111964672",
"typesense_memory_allocated_bytes": "11072338904",
"typesense_memory_fragmentation_ratio": "0.00",
"typesense_memory_mapped_bytes": "11397263360",
"typesense_memory_metadata_bytes": "226870128",
"typesense_memory_resident_bytes": "11111964672",
"typesense_memory_retained_bytes": "1533775872"
}
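The byte counts in the metrics above are easier to compare once converted to GiB. A minimal sketch using awk; the helper name `bytes_to_gib` is hypothetical and not part of Typesense.

```shell
# Hypothetical helper: convert a byte count ($1) to GiB with two decimals.
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

bytes_to_gib 11111964672    # typesense_memory_resident_bytes from the output above
bytes_to_gib 134997864448   # system_memory_total_bytes from the output above
```

With these inputs, Typesense's resident memory comes to roughly 10.35 GiB out of roughly 125.73 GiB of system memory.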
Jason Bosco
11/08/2022, 7:58 PM
000000? Could you try a random set of other strings to see if it’s consistent?
Jonathan Otto
11/08/2022, 7:58 PM
Jonathan Otto
11/08/2022, 7:59 PM
Jonathan Otto
11/08/2022, 7:59 PM
Jonathan Otto
11/08/2022, 8:01 PM
Jason Bosco
11/08/2022, 8:01 PM
Jonathan Otto
11/08/2022, 8:01 PM
Jonathan Otto
11/08/2022, 8:01 PM
Jonathan Otto
11/08/2022, 8:01 PM
Jonathan Otto
11/08/2022, 8:01 PM
Kishore Nallan
11/09/2022, 1:38 AM
Jonathan Otto
11/09/2022, 1:50 AM
00000 only had 4 matches in 50 million documents so it may not be due to that but i acknowledge your point)
i'm surprised and impressed that raw hex strings worked so well with typesense. most other search libraries couldn't handle it
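For anyone repeating the test, the search parameters suggested earlier in the thread can be appended to the original query, and random hex strings can stand in for "000000" to check whether latency is consistent. This is only a sketch: the base URL is taken from the curl commands above, the helper names `build_search_url` and `random_hex6` are hypothetical, and the `TYPESENSE_API_KEY` variable is an assumption about the local setup.

```shell
# Assumed base URL, taken from the curl commands earlier in the thread.
TYPESENSE_URL="http://localhost:8108"

# Hypothetical helper: print the search URL for query $1 with the suggested parameters.
build_search_url() {
  printf '%s/collections/addresses/documents/search?q=%s&query_by=address' "$TYPESENSE_URL" "$1"
  printf '&num_typos=0&typo_tokens_threshold=0&drop_tokens_threshold=0'
  printf '&prioritize_exact_match=false&highlight_fields=none\n'
}

# Hypothetical helper: print 6 random lowercase hex characters.
random_hex6() {
  head -c 3 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Usage, assuming a running server and an API key in $TYPESENSE_API_KEY:
# for i in 1 2 3 4 5; do
#   curl -s -o /dev/null -w '%{time_total}s\n' \
#     -H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY" \
#     "$(build_search_url "$(random_hex6)")"
# done
```

The commented loop prints curl's total request time, which includes HTTP overhead and so may differ slightly from the search time the server itself reports.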