# random
m
Hi all, our documents are structured as source documents and their translations. When someone searches, we need to search across the source and its translations. However, when the result is returned, we need it to include both the source and all of its translated documents (even the ones that did not match), and to count the whole set as 1 document. Is this possible within Typesense? All the related documents share the same relationId value.
j
I’d recommend putting all translations inside the same document in Typesense, when indexing
Eg:
{ fieldA_en, fieldA_fr, fieldA_de, fieldB_en, fieldB_fr, fieldB_de }
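A minimal sketch of how per-language rows could be folded into one document per relationId before indexing. The row shape (`locale`, `title`, `body` keys) and the helper name `merge_translations` are assumptions made for illustration; only `relationId` and the `field_locale` naming come from the thread.

```python
from collections import defaultdict


def merge_translations(rows, translated_fields=("title", "body")):
    """Fold rows that share a relationId into one document per relationId.

    Each translated field is stored under a locale-suffixed name,
    e.g. title_en, title_fr. The row layout is assumed, not prescribed.
    """
    merged = defaultdict(dict)
    for row in rows:
        doc = merged[row["relationId"]]
        # Typesense document ids are strings.
        doc["id"] = str(row["relationId"])
        for field in translated_fields:
            if field in row:
                doc[f"{field}_{row['locale']}"] = row[field]
    return list(merged.values())


# Hypothetical rows as they might come out of the database.
rows = [
    {"relationId": 7, "locale": "en", "title": "Hello", "body": "World"},
    {"relationId": 7, "locale": "fr", "title": "Bonjour", "body": "Monde"},
    {"relationId": 8, "locale": "en", "title": "Other", "body": "Text"},
]
docs = merge_translations(rows)
```

With this shape, a change to any single translation row means re-sending the merged document for that relationId, which is the sync cost raised in the next message.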
m
Thanks Jason… I was trying to avoid it because the data sync from the DB would no longer be just a single-row update… but maybe that’s the only way
s
We rolled out this implementation too
title_.*
By using wildcards we could expand to future locales without major collection changes 🙂
m
Oh this is great… makes things so much better… can I index it as such?
s
You then index
title_en
or
title_es
and a new field is generated.
And we have some logic to query Typesense by the locale the user requires
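A rough sketch of what the wildcard field and the per-locale query logic might look like. The collection name `articles` and the helper `build_search_params` are hypothetical; the `title_.*` field name and the idea of querying by the user's locale come from the messages above.

```python
# Schema fragment with a regex-named field: any field matching title_.*
# (title_en, title_es, ...) is indexed as a string.
schema = {
    "name": "articles",  # hypothetical collection name
    "fields": [
        {"name": "title_.*", "type": "string"},
    ],
}


def build_search_params(query, locale):
    """Build search parameters that target the title field for one locale."""
    return {"q": query, "query_by": f"title_{locale}"}


# Parameters for a Spanish-speaking user.
params = build_search_params("hola", "es")
```

The resulting dict is the kind of value that would be passed to the search call of a Typesense client; no client is created here so the snippet stays self-contained.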
m
How can I pass different locales when I define fields with a regex? E.g. text_chinese needs the zh tokenizer, but text_en needs a different one
We can always define a new collection and start using that when we add a new language; however, that would mean a complete reindexing
s
For those you would need to define them specifically
Order matters, so you could do
title_ja -> ja
title_zh -> zh
title_.* -> generic
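The ordering above could be written as a schema `fields` list along these lines, using the `locale` property on the specific fields and leaving the wildcard on the default tokenizer. The helper `first_matching_definition` is hypothetical and only illustrates the "first match wins" ordering described in the message; it is not Typesense's own resolution code.

```python
import re

# Specific locale fields first, generic wildcard last.
title_fields = [
    {"name": "title_ja", "type": "string", "locale": "ja"},
    {"name": "title_zh", "type": "string", "locale": "zh"},
    {"name": "title_.*", "type": "string"},  # default tokenizer
]


def first_matching_definition(field_name, definitions):
    """Return the first definition whose name pattern matches field_name."""
    for definition in definitions:
        if re.fullmatch(definition["name"], field_name):
            return definition
    return None


zh_definition = first_matching_definition("title_zh", title_fields)
es_definition = first_matching_definition("title_es", title_fields)
```

Here `title_zh` resolves to the entry carrying `locale: zh`, while `title_es` falls through to the generic wildcard entry.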
m
Okay…seeing that there are only a handful of tokenizers currently, we can do a comprehensive one without much overhead
Thanks a tonne @Sergio
s
Currently we rebuild the whole collection when adding fields, and reindex the whole database. There is an option to add a field to an existing collection, but it still requires indexing all the data. Since there is no "collection migration management", we just avoid conflicts by recreating everything and then moving the alias.
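A sketch of the rebuild-then-move-the-alias flow described above, assuming a client object shaped like the official `typesense` Python client (`collections.create`, `documents.import_`, `aliases.upsert`). The function name `rebuild_and_swap` and the timestamp-based collection naming are assumptions for illustration.

```python
import time


def rebuild_and_swap(client, alias, schema, documents):
    """Create a fresh collection, index everything, then repoint the alias."""
    # Versioned name so the new collection can coexist with the old one.
    new_name = f"{alias}_{int(time.time())}"
    client.collections.create({**schema, "name": new_name})

    # Bulk import; each result entry reports success per document.
    results = client.collections[new_name].documents.import_(
        documents, {"action": "create"}
    )
    failed = [r for r in results if not r.get("success")]
    if failed:
        raise RuntimeError(f"{len(failed)} documents failed to import")

    # Searches that go through the alias switch to the new collection here.
    client.aliases.upsert(alias, {"collection_name": new_name})
    return new_name
```

The old collection is deliberately left in place by this sketch; deleting it after the alias has moved would be a separate step.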