Berlin Buzzwords 2025

More Than Just The Tip Of The Iceberg
2025-06-17, Frannz Salon

A comprehensive workshop in which you will gain practical knowledge of how to deploy, configure and interact with Apache Iceberg, and how to use its advanced features. The workshop is presented in a local coding environment based on Jupyter notebooks and a Docker Compose stack.


In recent years, several table formats for large datasets have emerged to help data engineers deal with the complexity of handling substantial amounts of data in a flexible, performant and safe way. One of the most popular of these formats is Apache Iceberg.

In this workshop, you will gain up-to-date, hands-on experience of working with Iceberg. Using a local coding environment based on Jupyter notebooks and a Docker Compose stack, you are going to:

  1. Learn about the required components of a data processing system that uses Iceberg.
  2. Practice updating and querying Iceberg tables using several query engines and libraries.
  3. Use advanced features of Iceberg, such as flexible partitioning schemes, time travel and dataset branching.
  4. Learn about optimisation techniques and configuration "levers" you can pull to improve the overall performance and query speed of workloads using Iceberg.
  5. Peek under the hood of an Iceberg dataset to understand its metadata and the ways it improves query speed and supports data audits and lineage.
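
As a taste of the time travel feature mentioned in point 3, here is a minimal, library-free Python sketch of the idea behind it. It does not use Iceberg or any of its APIs: `ToyTable`, `Snapshot`, `append` and `files_as_of` are hypothetical names invented for illustration. The sketch assumes that every commit creates an immutable snapshot listing the table's data files, and that snapshots are committed in timestamp order.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified stand-in for a table snapshot."""
    snapshot_id: int
    timestamp_ms: int
    data_files: tuple[str, ...]  # every data file visible in this snapshot


@dataclass
class ToyTable:
    """Toy model only: keeps snapshots in memory, in commit order."""
    snapshots: list[Snapshot] = field(default_factory=list)

    def append(self, new_files: list[str], timestamp_ms: int) -> Snapshot:
        # A commit never mutates older snapshots; it adds a new one that
        # lists the previously visible files plus the newly written files.
        current = self.snapshots[-1].data_files if self.snapshots else ()
        snapshot = Snapshot(
            snapshot_id=len(self.snapshots) + 1,
            timestamp_ms=timestamp_ms,
            data_files=current + tuple(new_files),
        )
        self.snapshots.append(snapshot)
        return snapshot

    def files_as_of(self, timestamp_ms: int) -> tuple[str, ...]:
        # Time travel: pick the latest snapshot committed at or before the
        # requested timestamp (assumes snapshots are in timestamp order).
        visible = [s for s in self.snapshots if s.timestamp_ms <= timestamp_ms]
        return visible[-1].data_files if visible else ()


# Usage: two commits, then a read "as of" a moment between them.
table = ToyTable()
table.append(["a.parquet"], timestamp_ms=1_000)
table.append(["b.parquet"], timestamp_ms=2_000)
files_between_commits = table.files_as_of(1_500)
```

Because older snapshots are never modified, a read "as of" an earlier moment only needs to select a different snapshot. No data has to be copied or restored.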

This workshop is recommended for Data Engineers, Analytics Engineers and Machine Learning Engineers wanting to improve their data pipelines and data processing workflows.


Tags:

Store, Scale

Level:

Advanced

Michal Gancarski is a software and data engineer with over ten years of experience gained freelancing, working for startups and at Zalando, where he helped to build some of the core components of the company's data lake. Currently employed as a staff data engineer for GROPYUS Technologies GmbH, he focuses on knowledge graphs and RDF datasets, helping the company disrupt and optimise the residential construction industry.

In addition, Michal is an Apache Iceberg instructor, with video and live trainings on this table format published on the O'Reilly learning platform.

Michal holds a degree in Mathematics and Economics, as well as a graduate diploma in Data Science, both completed at the University of London under the academic direction of the London School of Economics.