2025-06-17, Frannz Salon
Data warehouses, lakes, lakehouses, streams, fabrics, hubs, vaults, and meshes. We sometimes choose deliberately, sometimes under the influence of trends, yet often end up with an organic blend. These choices can change operations cost and iteration speed by orders of magnitude. Let's dissect the paradigms and their operational aspects once and for all.
I have seen dozens of data platforms and noticed how architectural choices are often made without regard for the operational consequences, resulting in excessive operational burden and slow development. These choices have a huge impact on the effectiveness of data-centric organisations and separate disruptive companies from legacy enterprises. I will explain how the common operational procedures – deployment, failure handling, late data, data quality problems, bug remediation – play out differently depending on the data processing paradigm, and how to handle them with minimal cost and latency where possible. I will also cover when and how to bridge between the paradigms. Finally, I will share some innovations that we have found to further improve development iteration speed and operational efficiency.
I have found that the distinction between different data processing paradigms is often unclear, and that their practical differences are not concisely explained anywhere. This presentation is an attempt to create that explanation.
Stream, Scale, Operations
Level: Intermediate
Lars Albertsson is the founder of Scling, a data engineering startup based in Stockholm. Scling provides customer-tailored data engineering, analytics, and artificial intelligence implementations. Lars is a frequent conference speaker on data engineering and data strategy. Before founding Scling, Lars worked at Google, Spotify, and Schibsted, and as an independent consultant, helping organisations create business value from data processing and AI.