Berlin Buzzwords 2025

I turn complex engineering challenges into scalable, high-impact solutions—while building rockstar teams along the way. With a deep expertise in search, recommender systems, and distributed systems, I thrive at the intersection of machine learning, engineering, and business growth.
If you like to geek out over AI, search, or the magic of data-driven decisions, let’s connect!

Cross Domain Enterprise Search - Content Diversity at Scale

Adrien Grand

Adrien has been a committer on the Apache Lucene project since 2012, with a focus on ease of use, search efficiency and storage efficiency.

Shipping Lucene 10.0, 25 years in the making

Alessandro Benedetti

Alessandro Benedetti is an Apache Lucene/Solr committer and Solr PMC member, Director at Sease Ltd.
He believes in Open Source as a way to build a bridge between Academia and Industry and facilitate the progress of applied research. 
Alessandro is a passionate R&D software engineer, continuously applying the latest trends in Information Retrieval and AI to solve search problems. He’s been working on Learning To Rank for years and more recently he’s been exploring Generative AI technologies like Large Language Models and Retrieval Augmented Generation.
When he isn't on clients' projects, he contributes to the open-source community and presents at meet-ups and conferences such as ECIR, Search Solutions, Community Over Code, Haystack and Berlin Buzzwords.

End-to-End Semantic Search with Apache Solr 9.8 LLM Module

Alessio Vertemati

I'm a software architect and developer. I started as a developer working on a wide range of technologies, spanning networks, automation and web, before becoming a technology advisor for knowledge management and document management projects. I'm approaching AI from a deterministic perspective.

Contexts & Machines: How Document Parsing Shapes RAG results

Aline Blankertz

Aline is an applied economist with a special interest in the data economy, competition policy and platform regulation. She currently works as the Tech Economy Lead at the anti-monopoly organisation Rebalance Now and has co-founded the digital policy collective Structural Integrity. She has been involved in digital and data policy for several years, at Wikimedia Germany, the think tank interface and an economic consultancy, among others.

Unpacking Digital Sovereignty: How to avoid fueling the nationalist rise

Andrea Ponti

I am a Data Scientist with a Master's degree in Computer Science. I combine academic rigour with hands-on industrial experience, using cutting-edge technologies at the intersection of research and practice.

My research focuses on the optimisation of black-box functions using advanced Bayesian methods. From an industrial perspective, I specialise in the development of versatile machine learning solutions, with a focus on foundation models and Large Language Models (LLMs, aka what's behind ChatGPT).

I am fluent in Italian and English and can converse with AI models.

Contexts & Machines: How Document Parsing Shapes RAG results

Andrew Musselman

Andrew works on data and analytics, and runs software teams for a living. He has contributed to the Apache Mahout project for over a decade and has been an ASF member for four years.

Qumat: Apache Mahout Quantum Compute

Anna Ruggero

Hi!
I’m Anna Ruggero, an IT consultant in the information retrieval world.
I support clients in the process of improving their search engines with the implementation of innovative personalized solutions. I specialize in the integration of machine learning techniques with information retrieval systems, from Learning-to-Rank techniques to Neural Searches and Recommender Systems.
I extensively worked on e-commerce websites, improving their performance by developing personalized models and evaluation systems.

I strongly believe in innovation and research, keeping up-to-date with the latest academic studies and contributing to them. I participated in the European Conference on Information Retrieval 2022 with a poster on offline and online evaluation in the industry and published a paper on improving interleaving techniques for the evaluation of information retrieval systems at ECIR 2023.

I can't wait to talk about search with you!

AI-Powered Search Results Navigation with LLMs & JSON Schema

Athanasios Papaoikonomou

Senior ML Engineer / NLP at Elastic

Exploring reranking depth in modern search pipelines

Atri Sharma

Distributed systems and information retrieval guy

Cross Domain Enterprise Search - Content Diversity at Scale

Ben Gutkovich

Ben is the Co-Founder & COO of Superlinked.com, a compute and data engineering framework for turning data into vector embeddings, designed for building GenAI-powered RAG, Search, Recommender, and Analytics systems, while retaining control and maximising retrieval quality.

Previously, Ben supported C-level executives at large multi-national tech and media corporations with Growth and Operations Strategy as a Manager at McKinsey & Company and led Business Development at easyCar Club (acquired by Turo). Ben holds a bachelor's degree in Computer Science and an MBA from London Business School.

Mixture of Encoders: A Vector-Native Approach to Search

Berlin Buzzwords Team

Closing Session
Opening Session
Get-Together

Bilge Yücel

She is a developer relations engineer at deepset and is passionate about RAG, LLMs, and all things Gen AI. She enjoys making complex AI concepts accessible to all and helps developers build powerful AI applications with Haystack and beyond.

Go Beyond Basic RAG with Agentic Behavior

Carly Richmond

Carly is a Developer Advocate and Manager at Elastic, based in London, UK. Before joining Elastic in 2022, she spent over 10 years as a technologist at a large investment bank, specialising in front-end web development and agility. She is a UI developer who occasionally dabbles in writing backend services, a speaker, and a regular blogger.

She enjoys cooking, photography, drinking tea, and chasing after her young son in her spare time.

Observability for All!

Celeste Horgan

Celeste is a Developer Educator at Aiven, a managed database services company heavily invested in the PostgreSQL ecosystem. She has been involved in open source software as a technical writer and contributor for the Kubernetes project since 2020, and has had her work on inclusive language in tech featured in the New York Times.

Flavors of PostgreSQL® and you: how to choose a Postgres

Charlie Hull

I am a leading figure in the search industry, known for an honest, neutral and pragmatic viewpoint; I have held multiple roles including senior consultant, strategic advisor, project manager, sales & marketing director, conference organiser & speaker, trainer, writer & mentor. I am deeply connected with the business & technology of website and enterprise search engines with particular experience of small, high-value consulting companies. My past experience in software engineering gives me a highly informed perspective on search technology with a particular focus on open source platforms such as Lucene, Apache Solr, Elasticsearch and OpenSearch. More recently I have helped several companies use modern AI techniques to supercharge search. I co-wrote Searching the Enterprise, ran the Haystack conference for 5 years and held leadership positions at OpenSource Connections and Flax.

Breaking Search For Fun and Profit

Danica Fine

Danica began her career as a software engineer in financial services and pivoted to developer relations, where she focussed primarily on open source technologies under the Apache Software Foundation umbrella such as Apache Kafka and Apache Flink. She now leads the open source advocacy efforts at Snowflake, supporting Apache Iceberg and Apache Polaris (incubating).

She can be found on X, Bluesky, and Mastodon, talking about tech, plants, and baking @TheDanicaFine.

Quiet on Set: Building an On-Air Sign with Open Source Tech

Dennis Berger

Dennis Berger is a search and software engineer working at Otto. He designs and develops robust search backends using Apache Solr, Rust, and Java. Additionally, he explores ways to integrate emerging technologies with established systems. His work includes crafting microservice-based architectures and finding innovative ways to solve problems. Recognized for his technical precision, Dennis continues to push the boundaries of search with modern technologies.

What you see is what you mean; intent based ecommerce search

Dhrubo Saha

Dhrubo Saha is a machine learning engineer at Amazon Web Services (AWS) interested in machine learning algorithms, large language models, and distributed systems.

Advancing Multi-Modal Search Capabilities in Search Pipeline

Dotan Horovits

Horovits lives at the intersection of technology, product and innovation. With over 20 years in the hi-tech industry as a software developer, a solutions architect and a product manager, he brings a wealth of knowledge in cloud and cloud-native architectures, big data solutions, DevOps practices and more.

Horovits is an international speaker and thought leader, as well as an Ambassador of the Cloud Native Computing Foundation (CNCF). He runs the successful OpenObservability Talks podcast, and is a sought-after writer.

Currently working as senior developer advocate for the Open Source Strategy & Marketing team at AWS, Horovits evangelizes on the OpenSearch open source project by the Linux Foundation.

What’s New in the OpenSearch Project and Ecosystem

Edward Lambe

Edward Lambe is the Head of the MED Data Engineering team and the Deputy Head of MED IT at the Bank for International Settlements (BIS). Since joining the BIS in 2016, Edward has overseen the implementation of several key projects within the IT unit of the Monetary and Economic Department. Notably, he led the delivery of the BIS Data Portal, a core initiative of the BIS 2025 Innovation programme aimed at modernising the dissemination of BIS statistics. Prior to his tenure at the BIS, Edward held various statistical and IT roles at the Central Statistics Office, Ireland, and the Bank of Ireland. He holds a master’s degree from the Cork Institute of Technology and a bachelor’s degree from the National University of Ireland, Cork.

AI-Powered Search Results Navigation with LLMs & JSON Schema

Eric Pugh

Eric Pugh is the co-founder of OpenSource Connections. Today he helps OSC’s clients, especially those in the ecommerce space, build their own search teams and improve their search maturity, both by leading projects and by acting as a trusted advisor.

He is an active maintainer on the OpenSearch Documentation project, and is focused on expanding the suite of Search Relevance features in the OpenSearch Project.

Fascinated by the craft of software development, Eric Pugh has been involved in the open source world as a tester, developer, committer and user for the past twenty years. He is a member of the Apache Software Foundation and co-authored the book Apache Solr Enterprise Search Server, now on its third edition.

OpenSource Connections' mission to empower the world’s search teams comes directly from Eric’s belief in the open source software movement, and the importance of educating people to succeed with it, so that people own their technology.

When not thinking about search, Eric likes to get his hands dirty by building furniture. His next project is a reproduction Danish modern couch, using just hand tools!

Streamlining Search Quality: Search Relevance Workbench

Evgeniya Sukhodolskaya

Developer Relations at Qdrant with 7 years of IT experience across software engineering, machine learning, and technical management, and 3 years in Developer Relations. Holds a Master’s in Machine Learning, Data Analytics, and Data Engineering. Passionate about NLP, data-centric AI, and the role of vector databases in advancing AI technologies.

miniCOIL: Sparse Neural Retrieval Done Right

Fatima Taj

Fatima is a Senior Software Engineer at Yelp with a deep passion for mentoring early-career tech professionals. She has successfully guided many individuals through their first steps in the tech industry, helping them overcome challenges and achieve their career goals. In addition to her mentorship, Fatima is a prominent voice in the tech community, with a substantial following on LinkedIn, where she shares actionable insights on career development and growth.

An experienced speaker, Fatima has presented at leading conferences including Developer Week 2024, the Southern California Linux Expo (Scale) 2023 and 2024, NDC Copenhagen Developer Festival 2023, Women of Silicon Roundabout 2022, cdCon+GitOpsCon 2023 (as a keynote speaker), Momentum 2024, and the Black is Tech Conference in 2022 and 2023. She has also spoken at over 80 hackathons across North America. Her sessions are renowned for their practical, hands-on advice, making her a sought-after speaker on topics related to career progression and professional growth in the tech industry.

Fatima holds a master's degree in Data Science from HEC Montreal and a bachelor’s degree in Mathematics from the University of Waterloo, Canada.

How I Sidestepped ‘Being Glue’

Filip Makraduli

Filip Makraduli is a machine learning engineer and developer advocate with a strong background in AI systems, vector search, and large language models (LLMs). He holds a Master’s degree in Biomedical Data Science from Imperial College London. Currently, Filip works as a founding developer relations engineer at Superlinked, where he focuses on building real-time, multi-attribute search and recommendation systems. His work emphasizes the use of multi-encoder architectures to enhance retrieval quality and reduce reliance on reranking strategies. In the past, Filip worked as a data scientist at Marks & Spencer, where he contributed to AI-driven solutions for retail. He has also held machine learning engineering roles across several UK-based startups, focusing on applied AI and product-oriented ML development. In addition to his industry work, Filip has been active in the open-source community, particularly around LLM tooling and pipelines. He has delivered various talks on practical machine learning applications, including a presentation on AI-powered music recommendation systems titled “When music just doesn’t match our vibe, can AI help?” Filip is passionate about bridging the gap between cutting-edge AI research and real-world applications, particularly in the areas of personalization, search, and recommendation systems. He also has a strong interest in the business side of technology, especially how product, research, and engineering decisions align with go-to-market strategies, developer adoption, and long-term commercial value.

Mixture of Encoders: A Vector-Native Approach to Search

Frank Munz

Frank Munz solves large-scale data and AI challenges at Databricks. He authored three computer science books, built up technical evangelism for Amazon Web Services in Germany, Austria, and Switzerland, and once upon a time worked as a data scientist with a group that won a Nobel prize.

Frank has presented at top-notch conferences on every continent (except Antarctica, due to its inhospitable climate). His speaking engagements include Devoxx, KubeCon, and JavaOne.

He is renowned for his world-class demos, which often showcase innovative and interactive applications of technology. Some notable examples include:

Once Frank spit into a test tube, got his DNA analyzed and shared it with attendees using OSS Delta Sharing to let them explore his personal coffee metabolism snip.
Last year at the Data+AI Summit, Frank created a crowd-sourced distributed earthquake detection system that ingested streaming data from 250 attendees’ phone motion sensors at a rate of 100 million IoT events per day.

He holds a Ph.D. summa cum laude in Computer Science from TU Munich, where he worked on supercomputing in brain research (a system that allows better diagnosis for children with epilepsy possibly undergoing brain surgery).

Analysing Public Kafka Data from NASA Satellites

Fred O'Loughlin

Senior MLOps Engineer & Tech Lead for the Platform team at Climate Policy Radar

Building a knowledge graph for climate policy

Giovanna Monti

I am a software developer with a passion for front-end. I love sharing thoughts and experiences with the tech community, and that's why I started my speaker journey in 2023.
My motto? Understand things, before you do them. And, in case of doubt, don't be afraid to ask for the millionth time!

Harnessing AI to strengthen trustworthy information

Guy Shtub

Guy is experienced in creating products that people love. Previously, he co-founded two startups. Outside of the office, you can find him climbing, juggling, and generally getting off the beaten path. Guy holds a B.Sc. degree in Software Engineering from Ben Gurion University.

Performance & Fault Tolerance: Building a Modern Database

Gülçin Yıldırım Jelinek

Gülçin started working with Postgres at a startup company in 2012 and was amazed at how powerful Postgres truly is! Over the years, she has actively contributed to the PostgreSQL community by organizing conferences, delivering talks, and engaging as a dedicated community member. In recognition of her commitment, Gülçin was elected to the PostgreSQL Europe Board in 2017.

Fueled by her passion for PostgreSQL automation and cloud technologies, Gülçin took on the role of Cloud Services Manager and led the cloud development efforts at 2ndQuadrant, which was later acquired by EDB in 2020. Committed to fostering diversity and inclusion, she is an integral part of Postgres Women, advocating for increased representation of women in technical communities.

Currently, Gülçin is a Staff Database Engineer at Xata, where she continues to explore her interests in PostgreSQL. In addition to her engineering work, she is one of the co-founders of Kadin Yazilimci (Women Developers of Turkey) and has led the core team for more than 10 years. In 2023, she launched Diva: Dive into AI as a Kadin Yazilimci initiative and has been part of the organizing team since.

She is now recognized as a PostgreSQL Contributor by the Postgres project. Being part of PostgreSQL Europe Diversity Committee, she looks forward to serving the community and contributing to the project's longevity and health. Gülçin lives in Prague and is the co-founder and organizer of the monthly Prague PostgreSQL Meetup.

Anatomy of Table-Level Locks in PostgreSQL

Harrison Pim

I'm a data scientist / machine learning engineer with a background in computational / quantum physics. I write loads of Python and TypeScript, and a little bit of everything else.

I like working on hard R&D problems involving computer vision, natural language processing, graph theory, representation learning, recommendation systems, and information retrieval.

I love turning those research projects into end-to-end pipelines and services which help people in the real world.

Building a knowledge graph for climate policy

Ilaria Petreti

Ilaria is a Data Scientist with a background in Machine Learning and Natural Language Processing for Information Retrieval systems. Since joining the Sease team in 2020, she has worked on various projects, focusing on integrating Learning To Rank and Search Quality Evaluation in e-commerce ecosystems. More recently, she has been exploring the potential of Vector Search and Large Language Models in Search, leveraging these technologies to enhance retrieval strategies and improve result relevance.

Beyond her work, she is an active member of the information retrieval research community, regularly sharing her insights through blog posts, contributing to open-source projects, and speaking at international conferences such as Berlin Buzzwords and ElasticON.

AI-Powered Search Results Navigation with LLMs & JSON Schema

Isaac Chung

My focus is on making AI systems usable, scalable, and maintainable. I'm currently a Staff Data Scientist at Zendesk QA, working on LLM-powered features that see millions of conversations a day.

Previously at Clarifai, I helped build and maintain multimodal retrieval systems in production. My background is in Aerospace Engineering and Machine Learning and I hold undergraduate (B.A.Sc in EngSci) and graduate (M.A.Sc) degrees from the University of Toronto.

In my spare time, I am a maintainer for MTEB, I like to see the world, and do a bit of swim/bike/run racing.

Reproducibility in Embedding Benchmarks

Isabelle Mohr

Isabelle is a Machine Learning Engineer at Jina AI, where she develops and trains embedding models, working closely with her team to push the boundaries of what’s possible. Passionate about knowledge sharing, she regularly gives talks on machine learning and NLP, inspiring and connecting with others in the field.

Visual Literacy: Complex Document Retrieval with VLMs

Ivan Dolgov

Senior MLE@JetBrains

Training models which write code-related things

How to train a fast LLM for coding tasks

Jan Meskens

Jan Meskens is a seasoned data consultant with over a decade of experience in various data consulting roles. Through his consulting firm, Sievax, Jan has been pivotal in helping companies successfully integrate and implement data-driven strategies.

In academia, Jan shares his expertise with students at University College, where he teaches courses focused on artificial intelligence and data-centric topics. Beyond his consultancy and teaching, he actively contributes to the broader data community by writing insightful articles on Medium and presenting on data-related subjects at numerous conferences, meetups, and workshops.

Holding a PhD in Human-Computer Interaction, Jan brings a unique perspective to the fields of data and artificial intelligence. His guiding principle is clear: making data usable and understandable for everyone within an organization leads to valuable insights and outcomes.

Data Quality Management: The Good, The Bad, and The Messy

Jarek Potiuk

Independent Open-Source Contributor and Advisor, Committer and PMC member of Apache Airflow, Member of the Apache Software Foundation, Security Committee Member of the Apache Software Foundation. Organizer of community-focused events, speaker.

Jarek is an engineer with broad experience in many subjects: Open-Source, Cloud, Mobile, Robotics, AI, Backend, Developer Experience, and Security. He also has a lot of non-engineering experience under his belt: building a software house from scratch, being a CTO, organizing big international community events, technical sales support, PR and marketing advisory, and working on the legal aspects of security, licensing, branding and building open-source communities.

With experience in very small and very big companies and everything in between, Jarek found his place in the Open-Source world, where his individual-contributor drive can be used to its full potential.

Airflow 3 - the new beginning

Javier Ramirez

Developer Advocate at QuestDB and all around happy person. Fan of Open Source, Tech Communities, Data, and ML. He/him

Accelerating QuestDB: Lessons from a 6x Performance Boost

Jennifer Gaubatz

Entrepreneur in the AI Space, ex-McKinsey, Medical Doctor

Why Chatbots Still Fail: The Hidden Pitfalls of RAG

Jessie de Groot

Jessie is VP of People & Culture at Weaviate, a remote-first and open source AI-native start-up. Jessie is passionate about everything related to creating great people programs and sustaining a strong remote company culture.

Jessie loves to talk about remote-first culture, everything people-related, traveling, interior design, coffee and food.

From Culture to Open Source: Build Value-driven Communities

Justin Mclean

Justin Mclean is a highly experienced professional with over 30 years in web application development, education, and community work, and is an active contributor to open source software. Justin is a renowned speaker at conferences worldwide and currently serves as the Community Manager at Datastrato. He mentors projects in the Apache Software Foundation, serves as VP of the ASF Incubator, and is an ASF board member.

A decade of lessons in Open Source licensing

Kevin Liang

Kevin Liang is a software engineer on Bloomberg's Search Infrastructure engineering team in New York. As part of a team that offers search-as-a-managed-service, he works closely with Apache Solr day in and day out. This includes everything from the application-level software down to the bare-metal hardware. Recently, his work has focused increasingly on support for dense vector search, but has also covered a variety of subjects, including automation, backups, and major version upgrades.

Performance Tuning Apache Solr for Dense Vectors

Lars Albertsson

Lars Albertsson is the founder of Scling, a data engineering startup based in Stockholm. Scling provides customer-tailored data engineering, analytics, and artificial intelligence implementations. Lars is a frequent conference speaker on data engineering and data strategy. Before founding Scling, Lars worked at Google, Spotify, Schibsted, and as an independent consultant, helping organisations create business value from data processing and AI.

All the DataOps, all the paradigms

Lester Martin

Lester Martin is a seasoned developer advocate, trainer, blogger, and data engineer focused on data pipelines & data lake analytics using Trino, Iceberg, Hive, Spark, Flink, Kafka, NiFi, NoSQL databases, and, of course, classical RDBMSs. Check out Lester's blog at https://lestermartin.blog.

Apache Iceberg ingestion with Apache NiFi

Lewin von Saldern

Entrepreneur in the AI space, ex-McKinsey

Why Chatbots Still Fail: The Hidden Pitfalls of RAG

Luca Cavanna

Luca Cavanna is an Apache Lucene committer / PMC member, and principal engineer at Elastic. At Elastic he operates as technical lead of the Elasticsearch Search Foundations team. In Lucene, his main focus is on search concurrency, as well as fixing all the things and shipping releases.

Shipping Lucene 10.0, 25 years in the making

Lucian Precup

Lucian Precup is the CTO of all.site - the collaborative search engine developed at Station F in Paris. With his colleagues at Adelean, Lucian develops solutions for indexing, searching and analyzing data. Lucian regularly shares his knowledge in specialized conferences and organizes the Search, Data & AI Meetup.

Harnessing AI to strengthen trustworthy information

Manish Gill

Manish Gill works at ClickHouse Inc, where he is managing the AutoScaling team for ClickHouse Cloud. He is based out of Berlin and is deeply interested in Databases and Cloud challenges and still considers himself new to Kubernetes.

In a past life, he worked in an ML research team doing traffic prediction at global scale and was a Data Engineer for more than half a decade before that.

When StatefulSets are not enough

Marco Petris

I'm a Senior Software Developer and I'm currently working on AI driven search at OTTO.

What you see is what you mean; intent based ecommerce search

Marion Nehring

I am Marion - Community Manager and DX User Research Lead at Weaviate, tech innovation enthusiast, and lover of fantasy books! I am a highly positive personality with a great sense of humor and a strong human-centered and growth-minded approach to everything.

💞 During my 20 years in tech my biggest passion was (and still is) uniting people and tech in order to tackle everyday challenges, grow innovation, and drive change for a sustainable future with the help of technology.

So I get very excited when people (especially developers) come together to build the next big thing, be creative, and help each other be their most successful and the best version of themselves.

From Culture to Open Source: Build Value-driven Communities

Michal Gancarski

Michal Gancarski is a software and data engineer with over ten years of experience gained freelancing, working for startups and at Zalando, where he helped to build some of the core components of the company's data lake. Currently employed as a staff data engineer for GROPYUS Technologies GmbH, he focuses on knowledge graphs and RDF datasets, helping the company disrupt and optimise the residential construction industry.

In addition to that, Michal is an Apache Iceberg instructor with video and live trainings on this table format published on the O'Reilly learning platform.

Michal holds a degree in Mathematics and Economics, as well as a graduate diploma in Data Science, both completed at the University of London, under the academic direction of the London School of Economics.

More Than Just The Tip Of The Iceberg

Miloš Sutanovac

Miloš Sutanovac is a software engineer with nearly a decade of experience, including work with companies like BMW and Deutsche Telekom. He currently focuses on local-first architectures and building scalable, resilient applications. With a background in education, Miloš has mentored hundreds of students and enjoys sharing his knowledge through talks and workshops.

Going Local-First: A Primer

Muhammet Orazov

Muhammet is a Software Engineer at Ververica, the original creators of Apache Flink®. He is a member of the Engine team, which develops various Flink engines for different platforms. He is experienced in databases and distributed systems, and started his journey in streaming systems at Ververica.

FlinkCDC: Streamlining your data analytics pipelines

Nick Burch

Nick has been heavily involved in a number of Apache projects, such as Tika and POI, while having the fortune to know many of the people involved in the Apache Big Data and Search space! When not helping out with Apache things, Nick works as the Director of Engineering at Saible, where he leads a team making heavy use of Open Source technologies. When not helping ensure everyone gets paid, he is often to be found attending or organising BarCamps, Geek Nights, or other such fun events dedicated to sharing what's great and new!

Self-hosting AI LLMs - a beginners guide
Barcamp

Nikolay Sivko

Nikolay Sivko, co-founder and CEO of Coroot, aims to simplify troubleshooting in production for developers. He is passionate about Site Reliability Engineering practices, observability, and open source. Previously, he was the head of the Engineering group at a large technology company and founded an observability tool development company in Russia, which was successfully acquired. Currently, he resides in Turkey, focusing on developing a startup with an international market orientation.

Delay accounting: an underrated feature of the Linux kernel

Nomin-Erdene Oyun

Nomin-Erdene Oyun is a Senior Software Engineer on the Real-time Contributions Feeds Infrastructure Engineering team at Bloomberg in New York. With a strong interest in building impactful software solutions, she focuses on developing real-time data infrastructure and high-performance processing pipelines that drive transparency and enable data-driven decision making in the financial space. She enjoys the creative and technical journey from concept to deployment, and has been involved in bringing multiple projects to life from the ground up over the course of her career.

Zero to Scale: Telemetry pipeline with Apache Cassandra

Olena Kutsenko

Olena is a Staff Developer Advocate at Confluent and a recognized expert in data streaming and analytics. With two decades of experience in software engineering, she has built mission-critical applications, led high-performing teams, and driven large-scale technology adoption at industry leaders like Nokia, HERE Technologies, AWS, and Aiven.

A passionate advocate for real-time data processing and AI-driven applications, Olena empowers developers and organizations to use the power of streaming data. She is an AWS Community Builder, a dedicated mentor, and a volunteer instructor at a nonprofit tech school, helping to shape the next generation of engineers.

As an international speaker and thought leader, Olena regularly presents at top global conferences, sharing deep technical insights and hands-on expertise. Whether through her talks, workshops, or content, she is committed to making complex technologies accessible and inspiring innovation in the developer community.

Mastering real-time anomaly detection with open source tools

Peter Zaitsev

Peter Zaitsev is an entrepreneur and co-founder of Percona, Coroot, FerretDB and other tech companies. As one of the leading experts in Open Source strategy and database optimization, Peter has applied his technical knowledge and entrepreneurial drive to contribute as a board member and advisor to several open source startups. Additionally, Peter is the co-author of the book "High Performance MySQL: Optimization, Backup and Replication," one of the most popular books on MySQL performance.

Best Practices for Running Databases on Kubernetes

Pietro Mele

Italian, adopted by France not long ago, I am a constant learner, dedicated to computer science and discovery—whether uncovering solutions or gaining insights.

Speaker at:

ElasticON 2023 - Searching through large graphs using Elasticsearch
Devoxx France 2023 - Cloning CHATGPT with ElasticSearch and HuggingFace
10th Meetup Search & Data - Construire une API conversationnelle au dessus d'un moteur de recherche
Haystack US 2023 - Dive into NLP with the Elastic Stack
VoxxedDays Luxembourg 2023 - Cloner ChatGPT avec Hugging Face et Elasticsearch
DevoxxMorocco 2023 - Conversational Search - Unleashing the Power of Voice Search, Question Answering, and LLMs
DevFest Toulouse 2023 - Cloner ChatGPT avec Hugging Face et Elasticsearch
11th Meetup Search & Data - Exploration of an Open Source Rag System
Devoxx France 2024 - Mettre en place un RAG Open Source en 30 minutes
Devoxx France 2024 - Construire son Assistant Intelligent avec Hugging Face et Elasticsearch
OpensearchCon EU 2024 - Implementing an open-source RAG with OpenSearch
VoxxedDays Luxembourg 2024 - Home Assistant sous surveillance
Devoxx Morocco 2024 - A practical guide about prompt engineering
1st OpenSearch France UG - To the discovery of OpenSearch AI superpowers!
Big Data Europe 2024 - Exploring Large Graphs at the Heart of the French National Audiovisual Institute
ElasticON 2025 - Billion vectors baby
Devoxx UK 2025 - Exploring Large Graphs at the Heart of the French National Audiovisual Institute
OpensearchCon EU 2025 - Monitoring a smart home with Opensearch

Hybrid search on hybrid models, at scale

Piotr Kobziakowski

Piotr Kobziakowski is a Senior Principal Solutions Architect at Vespa.ai, where he leverages over 20 years of expertise in software architecture, network security, big data, and search technologies to design scalable AI-driven solutions for global enterprises. Based in Warsaw, Poland, he specializes in advising organizations on data, analytics and search applications.
Prior to joining Vespa.ai in October 2024, Kobziakowski held progressive technical roles at Elastic, where he architected search and analytics solutions for telecommunications. His career spans across industry leaders like Akamai, Nominum, Cloudmark and Bytemobile, with a focus on optimizing large-scale data and analytics infrastructure and security systems.
Piotr’s approach combines hands-on technical advisory with strategic problem-solving, delivered through workshops and customized training programs. He is recognized for translating complex technical concepts into actionable roadmaps, enabling enterprises to operationalize technology capabilities. He is a frequent speaker at events related to GenAI, data and analytics.

Vespa.ai’s Personalized Search: Advanced Ranking & Tensor framework

Radu Gheorghe

Radu has been in the search space for many years, mainly on Elasticsearch, Solr, OpenSearch, and, more recently, Vespa.ai. He helps users with both the relevance and the operations side of retrieval. He enjoys education in all its forms (training, blog posts, books, conferences...) and has had the chance to be involved in all of them.

Which GPU for Local LLMs?

Radu Pop

Radu provides consulting services as a Solutions Architect at Adelean. He handles projects around Elasticsearch and Adelean’s A2 search technology. He oversees the integration and evolution of search engines within large e-commerce platforms, marketplaces, or organizations' data lakes. Prior to joining Adelean, Radu acquired solid experience in web archiving, operating large-scale crawling systems in the context of several European research projects. He holds a PhD in Computer Science and an MSc in Distributed Systems.

Hybrid search on hybrid models, at scale

Rafał Kuć

Software engineer, trainer, consultant and author from time to time. Some would say that he is an all-in-one battle weapon concentrated on information retrieval, performance and user search experience. However, he also likes all the other cool stuff that is happening in the IT world. He likes to share his knowledge by giving talks at various meetups and conferences.

Which GPU for Local LLMs?

Raphael Franke

Raphael Franke is a Data Scientist at the Application Lab for Artificial Intelligence and Big Data at the German Environment Agency. With an academic background in mathematical statistics and data analysis, he specializes in applying AI to real-world environmental challenges. His interests lie in probabilistic time series forecasting and leveraging data-driven insights for sustainable impact.

gamma_flow: Denoise, classify and disentangle spectral data!

Roman Grebennikov

A principal ML engineer and former startup CTO working on modern search and recommendation problems. A pragmatic fan of open-source software, functional programming, LLMs and performance engineering.

How [not] to evaluate your RAG

Roman Kolesnev

Roman is a Principal Software Engineer at Streambased. His experience includes designing and building business critical event streaming applications and distributed systems in the financial and technology sectors.

Melting Icebergs: Direct access to Kafka Data via Iceberg

Saba Sturua

Saba is an ML Research Engineer in the Model Training team at Jina AI, where he develops state-of-the-art text and multimodal embedding models, focusing on enhancing search capabilities.

Visual Literacy: Complex Document Retrieval with VLMs

Saurabh Singh

Saurabh is a Software Development Manager at AWS leading the core search, release, and benchmarking areas of the OpenSearch Project. His passion lies in finding solutions for intricate challenges within large-scale distributed systems.

From Search to Insight: Leveraging OpenSearch for Scalable, AI-Driven Search Experiences

Sebastian Lenartowicz

Right often enough that it's probably not coincidence.

Precision farming powered by K3s and TensorRT

Shikhar Srivastava

Shikhar Srivastava is a Senior Software Engineer on the Real-time Contributions Engineering team at Bloomberg in London, where he designs and builds high-performance financial data systems. Shikhar is passionate about exploring innovative technologies to enhance real-time data processing. His career journey spans from developing machine learning models for ETA prediction at startups in India to architecting low-latency market data solutions at Bloomberg. Lately, he has been diving deep into Apache Cassandra, making use of its distributed database capabilities to tackle scalability challenges.

Zero to Scale: Telemetry pipeline with Apache Cassandra

Sonam Pankaj

Sonam is a Generative AI Evangelist. She is also the author of embedanything, an open-source ingestion, inference and indexing solution written in Rust, with more than 200k downloads and 500 stars in the past 9 months. She has previously worked in generative AI and conversational AI. She is also building StarlightSearch, a local and on-premise solution for search and agents in Rust.

Text Search on Images with Quantized ColPali

Stavros Macrakis

Stavros Macrakis is the senior technical product manager for OpenSearch focusing on document and e-commerce search. He has worked on search for 20 years and is passionate about search relevance.

Streamlining Search Quality: Search Relevance Workbench

Steffen Hoellinger

Steffen Hoellinger is the co-founder and CEO of Airy, an innovative AI startup focused on building open source data infrastructure that combines the power of data streaming, stream processing, and AI. With a deep passion for the power of real-time, AI-driven insights, Steffen leads Airy in providing scalable, efficient solutions that empower enterprises to harness the full potential of generative AI and advanced machine learning and help shape the future of business.

Flink Jobs as Agents – Stream Processing for Agentic AI

Tom Scott

A long-time enthusiast of Kafka and all things data integration, Tom has more than 10 years of experience (including 5+ years with Kafka) finding innovative and efficient ways to store, query and move data. Currently working at Streambased, Tom is building multi-tenant, on-prem and cloud Kafka services to attack common Kafka pain points and break down barriers to starting your data journey.

Melting Icebergs: Direct access to Kafka Data via Iceberg

Trevor Grant

Trevor Grant is getting back into speaking at conferences after a hiatus from an otherwise prolific career that was put on pause during the pandemic. During his pause he became a father, published a book (Kubeflow for Machine Learning: From Lab to Production), had a second son, consulted for a while, went back to work for IBM Research, and became car-free (this list is ordered neither chronologically nor by significance). He has been putzing around generative AI since ~2017, and someday hopes to give a talk on his Star Trek chatbots of 2018-2020. His primary open source interests at the moment are the qumat project of Apache Mahout and the gofannon project of The AI Alliance.

Qumat: Apache Mahout Quantum Compute

Uwe Schindler

Uwe is a committer and PMC member of Apache Lucene and Apache Solr. His main focus is on development of Lucene Core. He implemented fast numerical search and maintains the new attribute-based text analysis API. He studied Physics at the University of Erlangen-Nuremberg and works as managing director for SD DataSolutions GmbH in Bremen, Germany, a company that provides consulting and support for Apache Lucene, Elasticsearch, and Apache Solr. He also works for “PANGAEA – Publishing Network for Geoscientific & Environmental Data” where he implemented the portal's geo-spatial retrieval functions with Lucene Java. Uwe has given talks about Lucene at various international conferences, including previous editions of Berlin Buzzwords, ApacheCon EU/US, Lucene Revolution, Lucene Eurocon, and various local meetups.

State of native access in Apache Lucene

Ved Prakash

A Staff Data Engineer with over 15 years of experience in building enterprise data products. Currently pioneering the development of Siphon, a real-time data streaming product that enables reliable data delivery across Snowflake and ClickHouse using Apache Iceberg. Specializes in transforming traditional data pipelines into scalable data products with emphasis on reliability, observability, and user experience.
Their product engineering journey includes developing self-service data platforms, automated data quality frameworks, and real-time analytics solutions using Snowplow, Monte Carlo, and cloud-native technologies. They've successfully led the productization of data infrastructure across GCP and AWS, implementing infrastructure-as-code practices with Terraform and continuous delivery pipelines.
Passionate about building data products that deliver immediate business value, they focus on creating intuitive, reliable data solutions that empower organizations to make data-driven decisions with confidence. Their product-first approach combines technical expertise with user-centric design to deliver data solutions that scale.

Siphon : Modern Data Stack with SF-CH & Iceberg

Viola Rädle

Viola Rädle works at the interface between environmental and data science. She discovered her interest in environmental dynamics while studying physics at the University of Heidelberg. In her master's thesis, she researched groundwater systems and later deepened this topic through Bayesian data analysis. She expanded her Python skills as a junior researcher at HTWK Leipzig, where she worked on asphalt recycling and alternative methods of hydrogen production. Since 2023, she has been working as a data scientist at the Federal Environment Agency's AI Lab, where she supports authorities in the field of digitalization and data analysis. In addition to developing prototypes, where she is responsible for programming, project organization and science communication, she gives exciting and accessible lectures in the field of artificial intelligence.

gamma_flow: Denoise, classify and disentangle spectral data!

Volker Carlguth

Volker Carlguth has been working in the field of retail search engines for 20 years, taking on various roles as a software developer, consultant, and now as a product manager. He is passionate about understanding user intent and is particularly interested in applying emerging AI methods to this area.

What you see is what you mean; intent based ecommerce search

Wieneke Keller

tba

Precision farming powered by K3s and TensorRT

William Benton

William Benton is passionate about making it easier for machine learning practitioners to benefit from advanced infrastructure and making it possible for organizations to manage machine learning systems. His recent roles have included defining product strategy and professional services offerings related to data science and machine learning, leading teams of data scientists and engineers, and contributing to many open source communities related to data, ML, and distributed systems. Will lives in the midwestern United States with his wife and three children and spends some of his spare time chasing light on bicycles or capturing it with cameras.

"Do What I Mean": The History of AI and Program Synthesis

Yingjun Wu

Yingjun Wu is the founder of RisingWave Labs (https://www.risingwave.com/), a database company developing RisingWave, a distributed SQL database for stream processing. Before running the company, Yingjun was a software engineer on the Redshift team at Amazon Web Services and a researcher in the Database group at IBM Almaden Research Center. Yingjun received his PhD from the National University of Singapore and was a visiting PhD student at Carnegie Mellon University. He has been working in the field of stream processing and database systems for over a decade.

The Dark Secrets of Stream Processing

Yupeng Fu

Yupeng Fu is a Principal Engineer in the Platform Engineering organization at Uber. He leads the Search and Real-time Data Platforms. Yupeng is also an active contributor to open-source projects. He is an OpenSearch TSC member and an Apache Pinot PMC member.

Evolution of Uber's Search Platform