Berlin Buzzwords 2024

Taming the cost of Kafka workloads in the cloud
2024-06-11, Frannz Salon

This talk equips developers with everything they need to understand and optimize the cost of Kafka workloads in the era of cloud computing.


Cloud computing has changed how we operate and think about software: at any time, we have instant access to seemingly unlimited compute capacity and can scale our applications elastically on demand. We no longer have to procure physical machines and pay for them upfront; instead, we are billed only for the resources we have actually used. While usage-based pricing sounds great at first glance, it complicates cost planning and often leads to unpleasant surprises when the invoice arrives, as documented in numerous blog posts by authors who woke up to invoice amounts multiple orders of magnitude higher than expected.

This talk explores the main cost factors of Kafka workloads in the cloud: compute, network, and storage. We identify their contribution to the overall cost and walk through different techniques for reducing their cost footprint by, for instance, using Kafka's follower fetching to reduce cross-AZ traffic, compression to reduce overall traffic, or scaling streaming applications "to zero" in the absence of incoming events to save compute resources. We focus on setups where both Kafka and associated workloads, such as Kafka Streams or Flink applications, are operated on cloud platforms.
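As a rough illustration of two of the techniques named above, the following minimal sketch assembles the Kafka settings typically involved in follower fetching (broker.rack and replica.selector.class on the brokers, client.rack on the consumers) and in producer-side compression (compression.type). The helper function names, the availability zone "eu-central-1a", and the choice of zstd as the default codec are assumptions made for this example and are not taken from the talk.

```python
# Sketch only: builds plain config dictionaries, it does not talk to a cluster.
# Helper names and example values are hypothetical; the config keys are
# standard Kafka broker, consumer, and producer settings.

VALID_CODECS = {"none", "gzip", "snappy", "lz4", "zstd"}


def broker_overrides(availability_zone: str) -> dict[str, str]:
    """Broker settings that allow consumers to fetch from a same-AZ follower."""
    return {
        "broker.rack": availability_zone,
        "replica.selector.class":
            "org.apache.kafka.common.replica.RackAwareReplicaSelector",
    }


def consumer_overrides(availability_zone: str) -> dict[str, str]:
    """Consumer setting that tells the brokers which AZ the client runs in."""
    return {"client.rack": availability_zone}


def producer_overrides(codec: str = "zstd") -> dict[str, str]:
    """Producer setting that compresses record batches before sending."""
    if codec not in VALID_CODECS:
        raise ValueError(f"unsupported compression codec: {codec}")
    return {"compression.type": codec}


def to_properties(config: dict[str, str]) -> str:
    """Render a config dict in the key=value format of .properties files."""
    return "\n".join(f"{key}={value}" for key, value in sorted(config.items()))


if __name__ == "__main__":
    zone = "eu-central-1a"  # assumed example AZ
    print(to_properties(broker_overrides(zone)))
    print(to_properties(consumer_overrides(zone)))
    print(to_properties(producer_overrides()))
```

Follower fetching only avoids cross-AZ traffic when the consumer's client.rack matches the broker.rack of a broker hosting an in-sync replica, so both sides need to be labeled consistently. Which compression codec pays off depends on the payload and the CPU budget, so it is best measured per workload.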

The goal of this talk is to enable developers to benefit from the advantages of cloud computing in the context of Kafka workloads while taming the associated costs.

Stefan works as a staff software engineer at Confluent where he builds developer tooling for Kafka and other data streaming technologies. Previously, he co-founded a startup in the data streaming space, worked as a data engineer in the financial industry, and researched database systems on modern hardware. He loves Neapolitan pizza.