Internal ID vs Human-readable Names in Typesense

TLDR Vladimir asked for advice on using internal IDs or full strings in Typesense, highlighting pros and cons of both. Jason advised storing full strings for performance and simplicity, despite potential translation issues.

Solved
Nov 05, 2023 (3 weeks ago)
Vladimir
07:44 PM
I would like some help clarifying how and when to use internal ID values instead of human-readable names with Typesense.

For example, I have a collection with a "customer name" field.
I can store customer names as int identifiers and then remap human-readable customer names on the client.

Pros:
• I can use different labels for different user locales (translation support) without storing different values for different locales in the collection (so less memory consumption)
Cons:
• It is impossible to search within facet values if this field is a facet. Since the number of facet values can be quite high, and they are paged, it is not possible to do a client-side search.
Is there any guidance or best practices on that?

Also, if I store the same string value in, for example, 1 million documents, does the string length affect RAM consumption, or is it compressed internally?
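The ID-based approach described above could look roughly like the following on the client. This is only a minimal sketch: the `CUSTOMER_LABELS` table, the IDs, the names, and the `relabel_facet_counts` helper are hypothetical, and the input is modeled on the `counts` list of a Typesense facet result.

```python
# Hypothetical label tables keyed by locale, then by internal customer ID.
# IDs are kept as strings here because facet values come back as strings.
CUSTOMER_LABELS = {
    "en": {"17": "Acme Corporation", "42": "Northwind Traders"},
    "de": {"17": "Acme AG", "42": "Northwind Handel"},
}


def relabel_facet_counts(counts, locale):
    """Replace internal IDs in facet counts with labels for the given locale."""
    labels = CUSTOMER_LABELS[locale]
    return [
        # Fall back to the raw ID when no label is known for it.
        {"label": labels.get(c["value"], c["value"]), "count": c["count"]}
        for c in counts
    ]


# Assumed input shape: the "counts" entries of one faceted field.
counts = [{"value": "17", "count": 120}, {"value": "42", "count": 35}]
print(relabel_facet_counts(counts, "de"))
```

In this setup Typesense only ever sees `"17"` and `"42"`, so a server-side search within facet values for a name such as "Acme" has nothing to match against, which is the downside listed above.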
Nov 06, 2023 (3 weeks ago)
Jason
04:00 PM
In general, I would recommend storing the full strings in Typesense instead of just the ID, especially if you want to use them in search operations (e.g., sorting, faceting, grouping).

Repeated strings are only stored once in the in-memory index, so it "compresses" effectively.
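As an illustration of the full-string approach, here is a sketch of a schema with a facetable string field and of search parameters that search within facet values using Typesense's `facet_query` parameter. The collection name `orders`, the field names, and the `facet_search_params` helper are assumptions made for the example; it only builds the dictionaries and sends nothing to a server.

```python
# Hypothetical collection schema: the customer name is stored as a
# facetable string rather than as an internal ID.
orders_schema = {
    "name": "orders",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "customer_name", "type": "string", "facet": True},
    ],
}


def facet_search_params(prefix, max_values=10):
    """Build search parameters that narrow facet values by typed text."""
    return {
        "q": "*",
        "query_by": "title",
        "facet_by": "customer_name",
        # facet_query takes the form "<field>:<text typed by the user>".
        "facet_query": f"customer_name:{prefix}",
        "max_facet_values": max_values,
    }


print(facet_search_params("acm"))
```

Because the stored value is the readable name, the facet search runs inside Typesense and does not require the full list of facet values on the client.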

Vladimir
11:21 PM
Are there any particular benefits to this approach?

It has its own downsides:
• To translate the values, instead of doing simple value→label remapping on the client based on the current locale, we have to store values in all languages in separate fields;
• When such a value is renamed by the user, complex synchronization logic must be implemented. E.g., several collections have a "customer" field, and if a user renames a customer, this change must be correctly propagated to all collections that use this field. If the internal ID is stored instead of the label, no changes are required in this case (value→label remapping is done on the client dynamically).
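The first downside, one field per language, might be sketched as follows. The locale list, the `customer_name_<locale>` naming scheme, and both helpers are hypothetical and only show how the schema and the choice of facet field would grow with the number of locales.

```python
SUPPORTED_LOCALES = ("en", "de")  # hypothetical set of locales


def customer_fields():
    """One facetable string field per locale, e.g. customer_name_en."""
    return [
        {"name": f"customer_name_{loc}", "type": "string", "facet": True}
        for loc in SUPPORTED_LOCALES
    ]


def facet_field_for(locale, default="en"):
    """Pick the field to facet on, falling back to the default locale."""
    loc = locale if locale in SUPPORTED_LOCALES else default
    return f"customer_name_{loc}"


print(customer_fields())
print(facet_field_for("de"))
```

Every document then carries one value per locale, and a rename has to update each of those fields in every collection that stores them.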
Jason
11:36 PM
Performance would be the main advantage. You can send requests to Typesense directly and render the full search results page with just data from Typesense instead of having an intermediate post-processing step through your backend.
