I haven't had a chance to look into this in more detail. Sorting of string fields in Typesense is based on byte order, so it only works reliably for ASCII text. Other languages are represented as multi-byte UTF-8 sequences, so relying on byte order won't produce the expected order for them.
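To illustrate the byte-order issue, here is a minimal sketch. It is not Typesense code; `sort_by_bytes` is a hypothetical helper that just relies on the fact that `std::string` comparison works on raw byte values.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper, for illustration only.
// std::string's operator< compares the underlying bytes as unsigned values,
// so sorting with it gives plain byte order, not language-aware order.
std::vector<std::string> sort_by_bytes(std::vector<std::string> words) {
    std::sort(words.begin(), words.end());
    return words;
}
```

Given the UTF-8 strings "zebra", "apple", "éclair" and "Zoo", this puts "Zoo" first (uppercase ASCII bytes are smaller than lowercase ones) and "éclair" last (its first byte is 0xC3, which is larger than any ASCII byte), whereas a French reader would expect "éclair" to come before "zebra".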
We don't use the built-in C++ std::sort() because we have to store the strings already in sorted order for efficiency reasons.
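As a rough sketch of what "storing in sorted order" means in general (this is an assumption for illustration and not the data structure Typesense actually uses), an ordered container keeps values sorted as they are inserted, so no sort is needed at query time. `SortedStringStore` is a hypothetical name.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an index that keeps string values ordered
// at insert time. std::set orders std::string keys by byte value and
// drops duplicates.
class SortedStringStore {
public:
    void insert(const std::string& value) { values_.insert(value); }

    // Reading the values back requires no sorting step: iterating the
    // set already yields them in byte order.
    std::vector<std::string> in_order() const {
        return std::vector<std::string>(values_.begin(), values_.end());
    }

private:
    std::set<std::string> values_;
};
```

The trade-off is that the ordering is fixed by the comparison used at insert time, so switching to a language-aware order would mean changing how the values are stored, not just how they are read.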