Seeking Advice for Report Engine Build on Top of TS

TLDR Dan was seeking advice to build a reporting engine for their specific use case on TS but was facing difficulties. Kishore Nallan recommended using Clickhouse, which has features conducive to Dan's needs.

Powered by Struct AI


Join the chat
Sep 08, 2022 (15 months ago)
Photo of md5-f8f7b04dda4a09dec991cdc07f4d9c64
08:29 PM
I could use some advice! I have been trying to build a reporting engine on top of TS, and it seems, unfortunately, it’s not quite up to our specific use case. We are trying to group by a facet, and then return bucketed counts of facet values that with that value in common.

For example, if I want to see two years+ of data, it would be >730 searches in a multi_search (not even sure that’s doable..). On top of that, each search has to be filtered by a list of facet values, potentially in the 100s of thousands of values… This is clearly not going to work, but was a fun exercise to see if it could be done 🙂

The kind of data we have looks like this:
  "activityKind_facet": "activity.platform/reaction",
  "channelId_facet": "d1541dde-47b6-444c-b2ce-52668dd8d3cd",
  "channelKind_facet": "chat",
  "communityId_facet": "8e914cd4-a155-40cb-84e7-f476d3429d5c",
  "date_facet": 1656028800,
  "personaId_facet": "62f13402-1393-4963-a0c0-89d10e1237f3",
  "primaryLabel": "Added an emoji: :airplane:",
  "timestamp": 1656087039

..and what I’d like to be able to do is group by date_facet, and have counts of other facets by that group, filtered by (potentially) a large number of personaId_facet, as I mentioned above.

I can see how we can manually build rollup tables of all this data, but it feels annoying that we have to define what we can get out ahead of time - any advice on tech that can tackle this kind of problem without too much hassle, from flat documents, with consistent performance?
Sep 09, 2022 (15 months ago)
Kishore Nallan
Photo of md5-4e872368b2b2668460205b409e95c2ea
Kishore Nallan
03:18 AM
I will recommend using Clickhouse for this. In addition to supporting a full fledged SQL engine, it will allow you to quickly create aggregated materialized views (https://altinity.com/blog/clickhouse-materialized-views-illuminated-part-1) that are kept upto date as new data is ingested. These aggregated views will be much faster to query so you can run real-time reports.