Tiny Flink — Minimizing the memory footprint of Apache Flink
2023-06-20, Kesselhaus

We will explore options to run Apache Flink with a very low resource footprint, allowing users to run full streaming SQL queries or custom streaming applications on JVMs with less than 500 MB of memory.


Apache Flink was designed for, and is mostly used in, large-scale real-time data processing use cases. Companies report processing terabytes of data per second, or holding terabytes of state in huge clusters.

But what if you need to process low-throughput streams? Running a full, distributed Flink cluster might be overkill, as there’s quite a bit of overhead for distributed coordination.

In this talk, we’ll explore options to reduce your resource footprint. We’ll dive deeper into Flink’s MiniCluster, which lets you run Flink in-JVM for integration tests, as a microservice, or just as a small job processing your data in Kubernetes. We will also discuss lessons learned from running MiniCluster in production for a service offering Flink SQL in the cloud.

Attend this talk if you want to learn about Apache Flink and the various options for deploying and configuring it.

See also: Slides (2.7 MB)

Robert Metzger is a committer and PMC member at Apache Flink and a Staff Engineer at Decodable. He previously co-founded and successfully exited data Artisans (now Ververica), the company that originally created and commercialized Flink. He is a frequent speaker at conferences such as QCon and ApacheCon, and at meetups around the world.