Berlin Buzzwords 2025

Zero to Scale: Telemetry pipeline with Apache Cassandra
2025-06-16, Maschinenhaus

Picture billions of messages pouring in daily from thousands of data providers around the globe, which are then processed and published to customers. How can one design a telemetry system to capture, publish, and then index essential information about the data flowing through the system to give internal teams visibility to aid in troubleshooting?


A core part of our business is receiving and processing enormous amounts of financial data from all over the globe. This pipeline scales to tens of billions of pricing messages every day, and each message carries highly valuable information. Visibility into what was sent to us versus what was published to our customers is essential for internal teams to quickly troubleshoot issues reported by any of the data providers.

But how do we capture essential information about the data flowing through such a massive, high-throughput system, one that spans tens of thousands of processes running on close to a hundred machines and in which traffic peaks at more than a million messages per minute? In this talk, we will describe how we built a high-throughput telemetry system from scratch for streaming, storing, and searching this volume of data, using open source technologies like ZeroMQ, Apache Kafka, Kubernetes, and Apache Cassandra. You will gain valuable insights into the system’s design and performance, as well as the lessons we learnt along the way. We will cover everything needed to manage such high data throughput, from schema design and load testing to incremental deployment.


Tags:

Search, Stream, Scale

Level:

Intermediate

Shikhar Srivastava is a Senior Software Engineer on the Real-time Contributions Engineering team at Bloomberg in London, where he designs and builds high-performance financial data systems. Shikhar is passionate about exploring innovative technologies to enhance real-time data processing. His career journey spans from developing machine learning models for ETA prediction at startups in India to architecting low-latency market data solutions at Bloomberg. Lately, he has been diving deep into Apache Cassandra, making use of its distributed database capabilities to tackle scalability challenges.

Nomin-Erdene Oyun is a Senior Software Engineer on the Real-time Contributions Feeds Infrastructure Engineering team at Bloomberg in New York. With a strong interest in building impactful software solutions, she focuses on developing real-time data infrastructure and high-performance processing pipelines that drive transparency and enable data-driven decision making in the financial space. She enjoys the creative and technical journey from concept to deployment, and has been involved in bringing multiple projects to life from the ground up over the course of her career.