A Crash Course in Error Handling for Streaming Data Pipelines
2023-06-20, Palais Atelier

Learn how to handle errors in streaming data pipelines using concepts such as dead-letter queues.


Streaming data pipelines pose unique requirements for handling errors and other malfunctions: they run continuously and cannot be supervised manually. As a consequence, error handling needs to be automated as much as possible.
This talk answers three critical questions in the context of data streaming: What are potential errors? How should we handle the different kinds of errors? Which metrics help us keep track of the health of streaming data pipelines?
We discuss four kinds of errors:

(1) errors that occur when consuming Apache Kafka topics, e.g., when deserializing records;
(2) errors that occur when producing records to Apache Kafka topics, e.g., when serializing data;
(3) errors that occur when processing records, e.g., exceptions raised in data transformations; and
(4) errors caused by external factors, e.g., when the streaming data pipeline exceeds its available memory resources.
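To make categories (1) and (2) concrete: Kafka Streams lets an application register handlers for deserialization and production errors through its configuration. The minimal sketch below (the application id and bootstrap servers are placeholder assumptions) uses the built-in LogAndContinueExceptionHandler, which logs and skips corrupt records instead of failing the application, and the DefaultProductionExceptionHandler, which fails on non-retriable send errors.

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.errors.DefaultProductionExceptionHandler;
    import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;

    public class ErrorHandlingConfig {
        public static Properties streamsConfig() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-pipeline");       // assumption
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption

            // (1) Consuming: log and skip records that cannot be deserialized.
            props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
                    LogAndContinueExceptionHandler.class);

            // (2) Producing: fail the application on non-retriable send errors (the default).
            props.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG,
                    DefaultProductionExceptionHandler.class);

            return props;
        }
    }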
Having introduced these potential errors, we show how to cope with them through design patterns, such as dead-letter queues, and practical approaches, such as log-based alerts.
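As a sketch of the dead-letter queue pattern, the hypothetical handler below implements Kafka Streams' DeserializationExceptionHandler interface: it forwards the raw bytes of every record that fails deserialization to a dead-letter topic and lets the pipeline continue. The config key errors.dead.letter.topic is our own assumption, not a Kafka Streams built-in.

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.ByteArraySerializer;
    import org.apache.kafka.streams.errors.DeserializationExceptionHandler;
    import org.apache.kafka.streams.processor.ProcessorContext;

    public class DeadLetterQueueHandler implements DeserializationExceptionHandler {
        private KafkaProducer<byte[], byte[]> producer;
        private String deadLetterTopic;

        @Override
        public void configure(Map<String, ?> configs) {
            // "errors.dead.letter.topic" is a hypothetical, application-defined key.
            deadLetterTopic = (String) configs.get("errors.dead.letter.topic");
            Map<String, Object> producerConfig = new HashMap<>();
            producerConfig.put("bootstrap.servers", configs.get("bootstrap.servers"));
            producer = new KafkaProducer<>(producerConfig,
                    new ByteArraySerializer(), new ByteArraySerializer());
        }

        @Override
        public DeserializationHandlerResponse handle(ProcessorContext context,
                                                     ConsumerRecord<byte[], byte[]> record,
                                                     Exception exception) {
            // Preserve the raw record for later inspection or replay,
            // then keep the pipeline running.
            producer.send(new ProducerRecord<>(deadLetterTopic, record.key(), record.value()));
            return DeserializationHandlerResponse.CONTINUE;
        }
    }

The handler would be activated by setting default.deserialization.exception.handler to this class in the streams configuration.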
Finally, we discuss important metrics for monitoring the health of streaming data pipelines, e.g., consumer lag or the rate of records produced to dead-letter topics.
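Consumer lag can also be observed from outside the pipeline. The sketch below, assuming a local broker and the application id as consumer group name, uses Kafka's AdminClient to compute per-partition lag as the difference between the log end offset and the group's committed offset.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.ListOffsetsResult;
    import org.apache.kafka.clients.admin.OffsetSpec;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    public class ConsumerLagMonitor {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
            String groupId = "my-pipeline";                   // assumption: the application.id

            try (Admin admin = Admin.create(props)) {
                // Offsets the consumer group has committed so far.
                Map<TopicPartition, OffsetAndMetadata> committed = admin
                        .listConsumerGroupOffsets(groupId)
                        .partitionsToOffsetAndMetadata().get();

                // Current log end offsets of the same partitions.
                Map<TopicPartition, OffsetSpec> request = new HashMap<>();
                committed.keySet().forEach(tp -> request.put(tp, OffsetSpec.latest()));
                Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
                        admin.listOffsets(request).all().get();

                // Lag per partition: log end offset minus committed offset.
                committed.forEach((tp, offset) -> {
                    if (offset == null) return; // no committed offset yet
                    long lag = latest.get(tp).offset() - offset.offset();
                    System.out.printf("%s lag=%d%n", tp, lag);
                });
            }
        }
    }

Exporting this number periodically and alerting on sustained growth is a common way to catch stalled pipelines.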
While our examples come from Kafka Streams applications, the presented concepts transfer easily to other stream processing frameworks.

See also: Slides (5.2 MB)

Stefan is co-founder and CEO of DataCater GmbH, the company behind the real-time ETL platform based on Apache Kafka. He has more than 10 years of experience in software and data engineering and researched database systems on modern hardware during his PhD studies.