2.0 -//Pentabarf//Schedule//EN

PUBLISH NUWNJU@@program.berlinbuzzwords.de

-NUWNJU

Barcamp en

20260607T143000 20260607T173000 030000

Barcamp

Although the barcamp doesn't have a strict schedule, it won't be completely devoid of structure! #bbuzz barcamps are dynamic events, focused on the overall Berlin Buzzwords topics, tackling the same challenges but in a different format. At the barcamp each session runs for 30 minutes giving enough time to get into the meat of a topic, but without a chance of anyone getting bored. These are participatory sessions and more inclusive than regular conference talks, with everyone taking part. You can help by leading the session, by giving some insights, by asking some great questions, or maybe just with your enthusiasm. The barcamp will be coordinated and moderated by Nick Burch. Registration starts from 2:30pm PUBLIC CONFIRMED #BBuzz https://program.berlinbuzzwords.de/bbuzz26/talk/NUWNJU/ Palais Atelier Nick Burch PUBLISH 7X787J@@program.berlinbuzzwords.de

-7X787J

Opening Session en

20260608T093000 20260608T093500 000500

Opening Session

- PUBLIC CONFIRMED #BBuzz https://program.berlinbuzzwords.de/bbuzz26/talk/7X787J/ Kesselhaus Paul Berschick PUBLISH TMP3LK@@program.berlinbuzzwords.de

-TMP3LK

Building Resilience: The Next Decade of Open Source en

20260608T093500 20260608T102000 004500

Building Resilience: The Next Decade of Open Source

Quietly over the course of 25 years, open source software evolved from a domain perceived as that of only hobbyists into the invaluable backbone of modern digital infrastructure. Sustaining that success for the future will require more than code. It requires resilience: a trait not of technology, but of people. Of community. With increasing regulation around the world, evolving cybersecurity requirements, burnt-out contributors, and stagnant corporate participation and funding, how do we ensure the ecosystem's continued success? The things that have worked for the first decades will not be the things that keep us going. Let's look together at sustainability not only as a funding problem, but also from the perspectives of global policy changes, security, and other intertwined issues that face open source in the coming years. PUBLIC CONFIRMED Keynote https://program.berlinbuzzwords.de/bbuzz26/talk/TMP3LK/ Kesselhaus Ruth Suehle PUBLISH KTRN8U@@program.berlinbuzzwords.de

-KTRN8U

Low-Resource Languages as Stress Tests for NLP Data en

20260608T104000 20260608T110000 002000

Low-Resource Languages as Stress Tests for NLP Data

This talk is an experience report on annotating language data in a low-resource setting and what this process reveals about data quality in NLP pipelines. Rather than treating low-resource languages as edge cases, the talk frames them as stress tests that make structural data issues visible early and clearly. The session outlines what linguistic fieldwork data looks like before it becomes “training data,” highlighting ambiguity, context dependence, and variation that cannot always be resolved through additional labeling. It then focuses on the annotation decisions required when categories are underspecified or multiple analyses are plausible, and connects these challenges to familiar issues in applied NLP, such as label noise, brittle representations, and unexpected model behavior. The goal is to share practical lessons from linguistic data work that help NLP practitioners reason more realistically about annotation, uncertainty, and robustness. Attendees will gain concrete insights into why “clean data” is often an illusion and how early data decisions shape downstream systems. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/KTRN8U/ Kesselhaus Priscilla Lola Adenuga PUBLISH ANZCWN@@program.berlinbuzzwords.de

-ANZCWN

Dynamic Broker-Side Filtering for Kafka en

20260608T111000 20260608T115000 004000

Dynamic Broker-Side Filtering for Kafka

Watch a live implementation of broker-side filtering that solves a 7-year-old debate. You'll see working code, performance benchmarks, and real production deployments from financial services and logistics. Leave with a Kafka-compatible solution you can deploy immediately on StreamNative's platform with sub-millisecond filtering that cuts network traffic by 60-80%. Perfect for real-time analytics and compliance monitoring. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/ANZCWN/ Kesselhaus David Kjerrumgaard Álvaro Rodríguez PUBLISH XULYFE@@program.berlinbuzzwords.de

-XULYFE

The Agent Era: How AI Agents Are Reshaping Data Platforms en

20260608T120000 20260608T124500 004500

The Agent Era: How AI Agents Are Reshaping Data Platforms

Autonomous AI agents are becoming first-class users of data infrastructure, and most data platforms weren't designed for them. This panel brings together engineers from Snowflake, Elastic, ClickHouse, and Xata to have an honest conversation about what that collision looks like in practice. Each platform brings a different angle: cloud-scale warehousing, search and observability, real-time analytics, and Postgres. They'll explore what it concretely means to make a data platform agent-ready, from query reliability to access control to the performance characteristics that agentic loops require. Note: I confirmed with a few guests, but once the panel is approved, I can confirm with other leaders from data platforms to join the panel. If we can schedule it on Monday, the Head of DevRel from Elastic is able to join. PUBLIC CONFIRMED Panel https://program.berlinbuzzwords.de/bbuzz26/talk/XULYFE/ Kesselhaus Monica Sarbu Danica Fine Philipp Krenn PUBLISH PLNTP9@@program.berlinbuzzwords.de

-PLNTP9

Agentic Retrieval: Building Self-Optimizing Search Systems en

20260608T140000 20260608T144000 004000

Agentic Retrieval: Building Self-Optimizing Search Systems

Relevance feedback loops used to take months. Developers would capture interaction data, train models offline, and push updates through slow deployment cycles. The arrival of AI agents as a new class of search user has compressed this cycle to seconds. In agentic workflows, retrieval is no longer a single tool call that returns results; it is a tight, iterative loop where the agent refines its own queries, evaluates result quality, and tries again. This talk goes beyond basic retrieval-augmented generation (RAG) to explore what comes next: *Agentic Retrieval*. We are entering a paradigm where agents don't just reformulate queries, but dynamically adjust the retrieval system itself, tuning scoring models, modifying schema configurations, and making indexing decisions to match the specific demands of a task. This is the logical extreme of the feedback loop: a self-reinforcing system where the agent optimizes its own context window. We will present the infrastructure principles that make this possible, drawing on our work building agent-native retrieval at Hornet. The talk covers: * **Schema-first API design** that gives agents a structured, predictable interface to work with * **Verifiable state changes** that let agents confirm the effect of their own modifications * **RL-compatible feedback signals** that enable agents to self-correct rather than relying on human-in-the-loop tuning Attendees will leave with a concrete understanding of how to architect a retrieval stack where agents can tune their own environment in real time, and why the shift from human-facing search to agent-facing retrieval infrastructure demands fundamentally different design choices. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/PLNTP9/ Kesselhaus Skip Everling Jo Kristian Bergum PUBLISH E7HE9V@@program.berlinbuzzwords.de

-E7HE9V

Ultraviolet: Turn Hidden Document Data into an AI Advantage en

20260608T145000 20260608T151000 002000

Ultraviolet: Turn Hidden Document Data into an AI Advantage

Artificial intelligence is no longer only something we build — it is something we design for. As AI systems increasingly mediate how users access information, make decisions, and interact with digital products, a new role is emerging: designing how intelligence itself is perceived, trusted, and behaves in real-world environments. This perspective becomes especially critical when AI systems depend on complex information artifacts such as documents. Documents remain one of the primary means of information exchange across industries, with PDFs alone accounting for billions of files generated each year. Despite their ubiquity, PDFs are often treated merely as containers of visible text and images. In reality, they encapsulate a much richer and more complex internal structure, including annotations, cross-references, accessibility artifacts (such as alternate text), hidden or layered content, embedded attachments, metadata, and other non-obvious elements. These components are largely invisible to users, yet they can have a profound impact on downstream artificial intelligence systems. This talk explores how agentic workflows, automated information extraction, and retrieval-augmented generation (RAG) can be influenced, or even exploited, by the way PDF internals are interpreted. We examine the types of hidden information that can be found or intentionally included within PDFs, and how parsers and document processing tools handle (or ignore) this information. We further investigate the risks and opportunities associated with PDF metadata and hidden content. On one hand, poorly handled metadata can introduce vulnerabilities, including malicious data-injection attacks that target AI pipelines at the document layer. On the other hand, these same mechanisms may offer untapped potential: can documents embed structured signals, pre-computed representations, or even vector-like information that could enhance retrieval, indexing, or storage? 
Could documents themselves act as intelligent carriers of contextual knowledge? Using practical examples, the talk aims to make “visible” the “invisible” layer behind visualized text and images, and its interaction with AI systems. Framed through the lens of AI experience design, we discuss what it means to make content truly AI-ready, why structure and intent matter when information is consumed by both humans and machines, and how responsible design can improve reliability, transparency, and control. Participants will gain a deeper understanding of how hidden document structures affect AI behavior, how to safeguard pipelines against adversarial or accidental misuse, and how to responsibly leverage document internals to build more robust, trustworthy, and intentionally designed AI-powered knowledge systems. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/E7HE9V/ Kesselhaus Alessio Vertemati PUBLISH WCAJ99@@program.berlinbuzzwords.de

-WCAJ99

How Apache Iceberg Enables Multi-Engine Data Platforms en

20260608T152000 20260608T160000 004000

How Apache Iceberg Enables Multi-Engine Data Platforms

Modern data platforms increasingly rely on multiple compute engines to serve diverse workloads, from batch analytics to interactive SQL and streaming. Without a shared table layer, this flexibility often leads to duplicated data, inconsistent results, and operational complexity. Apache Iceberg provides a common table abstraction that decouples storage from compute, enabling multiple engines such as Spark, Trino, and Flink to operate safely on the same data. This talk explores the architectural patterns that make multi-engine platforms possible, including metadata-driven concurrency, snapshot isolation, and schema evolution. We’ll discuss how to choose the right engine for different workloads, how catalogs act as the coordination layer, and what operational practices are required to maintain performance and consistency at scale. Attendees will leave with practical guidance for designing open, multi-engine data architectures built on Apache Iceberg. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/WCAJ99/ Kesselhaus Geetha Anne PUBLISH GEHRDC@@program.berlinbuzzwords.de

-GEHRDC

10x CouchDB Performance Gains for a AAA Game Launch en

20260608T163000 20260608T171000 004000

10x CouchDB Performance Gains for a AAA Game Launch

This talk will take the attendee on a performance tuning journey. With benchmarking fundamentals as the foundation, we go through six distinct steps, each one finding the next bottleneck in a large distributed cluster setup of CouchDB. We will cover, in-depth, ways to measure and improve: - Disk I/O - HTTP request and response times - TCP Accept handling - CPU Utilisation and Process Scheduling in an Erlang system - Erlang cluster communication networking In the end, our client successfully launched their latest version of a AAA sports game with capacity to spare. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/GEHRDC/ Kesselhaus Jan Lehnardt PUBLISH SSYYQ8@@program.berlinbuzzwords.de

-SSYYQ8

AI is here – time to throw away our search engines? en

20260608T172000 20260608T180500 004500

AI is here – time to throw away our search engines?

AI has revolutionised the world of search – first by giving us better ways to understand language, rewrite content and provide single answers, and latterly with augmented coding techniques & AI agents to configure our engines & run our searches for us. If you're working on search applications today you're probably looking at AI techniques first - but there's decades of work behind the traditional search techniques that you can't afford to ignore. Our panel, which includes leading experts on both old-school and new search techniques, will help you decide how to combine the best of both worlds. PUBLIC CONFIRMED Panel https://program.berlinbuzzwords.de/bbuzz26/talk/SSYYQ8/ Kesselhaus Charlie Hull Atita Arora Jo Kristian Bergum Dmitry Kan Evgeniya Sukhodolskaya PUBLISH TYGHXR@@program.berlinbuzzwords.de

-TYGHXR

Get-Together en

20260608T180500 20260608T210500 030000

Get-Together

- PUBLIC CONFIRMED #BBuzz https://program.berlinbuzzwords.de/bbuzz26/talk/TYGHXR/ Kesselhaus PUBLISH 9STDXK@@program.berlinbuzzwords.de

-9STDXK

OpenSearch Software Foundation: 1 Year of Open Governance en

20260608T104000 20260608T110000 002000

OpenSearch Software Foundation: 1 Year of Open Governance

In September 2024, the OpenSearch community announced the formation of this new home for the project, the OpenSearch Software Foundation, and since then we’ve successfully transitioned to the Linux Foundation's technical and governance stack. Our mission is to empower users to navigate the OpenSearch ecosystem, recruit skilled talent, and adopt the platform effectively, all while supporting sustainable open source innovation. Over this period, the OpenSearch community has demonstrated remarkable momentum. We’ve seen more than 8,800 contributions, driven by a vibrant and growing community of over 3,300 individual contributors from more than 400 organizations. This surge in activity has placed OpenSearch among the top 20 most active projects across the entire Linux Foundation ecosystem by contributor engagement. We’ve collaborated closely with our community and members on key initiatives and foundational work during this transition. Join us to hear about the journey so far and the future path for the OpenSearch project and Foundation. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/9STDXK/ Maschinenhaus Kris Freedain Carlos Rolo PUBLISH 7ATP3V@@program.berlinbuzzwords.de

-7ATP3V

Apache Solr 10: What's Coming up for Vector Search en

20260608T111000 20260608T115000 004000

Apache Solr 10: What's Coming up for Vector Search

Apache Solr 10 introduces many advancements in the realm of vector search, making many interesting Lucene features surface. Starting from scalar and binary quantization, this feature helps users reduce both the query time and memory footprint at the cost of some accuracy and disk space: a welcome trade-off for those using Solr on massive amounts of vectors. Early termination introduces the ability to speed up certain queries that saturate a configurable threshold, and Seeded KNN gives the ability to start the HNSW graph exploration from a lexical result set, rather than random entry documents (core mechanism of the Solr vector search implementation). ACORN filtering improves the way pre-filtering happens when you mix traditional keyword searches with KNN queries, and the query combiner finally offers a comprehensive strategy to mix up query results, opening the door to a more flexible hybrid search. To conclude with a cherry on top of the cake, we'll go through many bug fixes and minor improvements, still worth mentioning. The audience is expected to get an overview of all the new interesting vector search features coming with Solr 10 and learn how to use them and benefit from them in their use cases. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/7ATP3V/ Maschinenhaus Alessandro Benedetti Ilaria Petreti Anna Ruggero PUBLISH MCH7ZZ@@program.berlinbuzzwords.de

-MCH7ZZ

Constant-Time Aggregations with Star-Tree in OpenSearch en

20260608T120000 20260608T124000 004000

Constant-Time Aggregations with Star-Tree in OpenSearch

Traditional distributed search engines face a significant bottleneck: aggregation latency scales linearly with document count. As datasets grow to billions of records, this "scan-on-query" model fails to meet real-time requirements. Inspired by Apache Pinot and star-cube research, OpenSearch introduced the Star-Tree index to decouple performance from raw data volume. This session dives into the engineering behind this transition. We will explore how we extended Lucene’s DocValuesFormat to support multi-field materialized views directly within segment structures and shifted the performance dependency from total document count to the cardinality of indexed dimensions. We will detail the implementation of "star nodes"—wildcard structures representing aggregates across all values of a dimension—and how they enable constant-time query pruning. Attendees will learn about the challenges of building a multi-field index in single-field-centric ecosystems like Lucene, how to fine-tune storage versus speed, and why this architectural shift achieved up to 1000x faster queries. Finally, we will discuss operational lessons learned, including why this structure is limited to append-only workloads and how it bridges the gap between traditional search and OLAP-style analytics. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/MCH7ZZ/ Maschinenhaus Sandesh Kumar Shailesh Kumar Singh PUBLISH TSMVSN@@program.berlinbuzzwords.de

-TSMVSN

Turning the database inside out again en

20260608T140000 20260608T144000 004000

Turning the database inside out again

Over a decade ago, Martin Kleppmann’s Turning the Database Inside Out challenged us to rethink data systems from first principles—placing the event stream at the center of storage, computation, and truth. That vision sparked an entire ecosystem of event-driven architectures, real-time analytics systems, and stream-aware databases. But what if that journey is still unfinished? This talk explores the next leap: reimagining the database itself through the lens of streaming. Instead of treating the event log as a narrow integration pipe, we’ll treat it as the core substrate for all data—augmented with the essential primitives that traditional databases provide: long-term storage, indexing, and rich materializations. To get there, we move beyond simple append/consume patterns and embrace modern table formats and storage layers capable of making event data durable, queryable, and universally accessible. The result is an architecture that collapses fragile pipelines, dissolves the boundary between real-time and historical processing, and provides a unified view of the world using widely adopted open standards (Apache Kafka and Apache Iceberg). This changes the question from “what’s happening right now?” to “what has happened across the entire lifespan of the system?”. You’ll walk away seeing both the database—and the stream—through a fundamentally new lens. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/TSMVSN/ Maschinenhaus Tom Scott Roman Kolesnev PUBLISH ULE3MU@@program.berlinbuzzwords.de

-ULE3MU

From OLTP to OLAP: Is PostgreSQL Eating Analytics Too? en

20260608T145000 20260608T151000 002000

From OLTP to OLAP: Is PostgreSQL Eating Analytics Too?

Traditionally a row-oriented OLTP system, PostgreSQL is now gaining columnar capabilities through extensions such as Citus, TigerData columnar, pg_duckdb and more built on PostgreSQL’s pluggable storage layer. This raises a serious architectural question: can PostgreSQL evolve into a competitive analytical engine? In this talk, we provide a structured overview of the current PostgreSQL columnar ecosystem — how these extensions work, what features they offer, and where they differ in terms of compression, execution model, and performance. We place these developments in the broader context of modern database trends: HTAP ambitions, consolidation of data stacks, and the gravitational pull of PostgreSQL as a platform. Finally, we discuss selected performance observations and architectural considerations when comparing columnar PostgreSQL setups to established analytical systems such as ClickHouse, as a technical exploration of trade-offs. Is PostgreSQL becoming a universal data platform, or are there structural limits to how far columnar extensions can take it? PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/ULE3MU/ Maschinenhaus Daniel Seybold PUBLISH B9TVRQ@@program.berlinbuzzwords.de

-B9TVRQ

Streamling: Lightweight, Extensible Streaming on DataFusion en

20260608T152000 20260608T160000 004000

Streamling: Lightweight, Extensible Streaming on DataFusion

Stream processing systems are complex. Our previous platform was Flink-based. We learned a lot from it, but wanted a lighter approach for workloads that do not need distributed stateful processing. At the same time, a growing ecosystem was emerging around Apache DataFusion and Arrow. We built Streamling to explore a specific point in this design space: a production streaming engine that stays intentionally simple, with no distributed shuffle and no stateful joins, and focuses on operational clarity, extensibility, and cloud-native deployment. **Part 1: The Engine Internals** A deep dive into how we extended DataFusion for streaming: - **Streaming SQL on DataFusion**: We use DataFusion's query planner, custom `TableProvider`s, and `ExecutionPlan` traits to process Kafka Avro data as continuous Arrow `RecordBatch` streams. - **Checkpoint coordination**: A lightweight Chandy–Lamport style protocol (Marker → Ack → Finalizer) that guarantees at-least-once delivery. State is persisted via a pluggable backend system (in-memory, SQLite, or PostgreSQL in production), keeping checkpoint storage decoupled from the engine itself. - **Runtime extensibility**: WebAssembly script transforms (JS/TS via Extism), HTTP handler transforms, and an `abi_stable` plugin system provide FFI-safe, language-agnostic extension points without requiring engine forks. - **Dynamic tables**: Stateful lookup tables can be populated from streams or updated externally (for example, in Postgres), enabling deduplication and enrichment in SQL through custom UDFs without pipeline restarts. **Part 2: From Engine to Platform** How we designed the system for production cloud deployment: - **Control/data plane separation**: The engine (data plane) is decoupled from orchestration (control plane), enabling both fully managed and BYOC (Bring Your Own Cloud) deployment models. 
- **Kubernetes-native lifecycle**: Pipeline management (create, pause, resume, restart), resource sizing, secret injection, and namespace isolation. - **Clean separation**: Why defining this boundary early keeps the engine portable and the platform flexible across deployment models. **Key takeaways for the audience:** 1. DataFusion is proving to be a versatile foundation for streaming, not just batch. We'll share a brief overview of the landscape and where different projects sit. 2. You don't need distributed stateful processing for many streaming workloads. Deliberately scoping down unlocks operational simplicity. 3. Designing a clean control/data plane boundary from day one keeps your architecture flexible for different deployment models. This talk is aimed at engineers building or evaluating streaming platforms, and anyone exploring DataFusion beyond batch analytics. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/B9TVRQ/ Maschinenhaus Xiao Meng Rafael Aguiar PUBLISH 77KQHG@@program.berlinbuzzwords.de

-77KQHG

Search is Back: Solving the "Context Crisis" for AI Agents en

20260608T163000 20260608T171000 004000

Search is Back: Solving the "Context Crisis" for AI Agents

The challenges of building effective AI systems today echo the early days of web search: we are still struggling to get the context right. But where we once had simple user queries, we now have complex agents demanding structured context that scales faster than any past user. The key to context has always been finding ways to accumulate, structure, and recall relevant knowledge across interactions, at the point where knowledge graphs and vector search converge. In this talk, we connect the dots between the old problem of user context and the new reality of Agentic AI, and show accessible ways to create context for both people and programs. What you will learn in this session while we walk through a solution built only with open-source tools: 1) How to deliver meaningful context to people and agents 2) The tradeoffs between common retrieval approaches 3) Practical patterns you can apply to build more reliable AI applications PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/77KQHG/ Maschinenhaus David Louis Hollembaek Vincent Pistor PUBLISH XPADMH@@program.berlinbuzzwords.de

-XPADMH

Building a Local News RAG: The Quest for Trustworthiness en

20260608T172000 20260608T180000 004000

Building a Local News RAG: The Quest for Trustworthiness

Building a RAG system for a local newspaper is a high-stakes challenge where "hallucinations" aren't just bugs—they are threats to the brand's main currency: trust. In this session, we share our unfiltered journey of moving beyond "clean" documentation into the messy reality of local news. We’ll dive into the "Long Tail of Locality," exploring how to handle hyper-local contexts (villages and regional politics) that LLMs have nearly no knowledge of. We will discuss the problem of semantic collisions—where standard hybrid search fails to distinguish between dozens of nearly identical weekly football reports—and how we navigate complex customer expectations and unclear usage patterns. From the architectural nightmare of structuring legacy news data to the ongoing battle for factual reliability, this is a talk about what worked, what we still haven't fixed, and the hard lessons learned when "state-of-the-art" AI meets the local beat. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/XPADMH/ Maschinenhaus Marcel Dokters PUBLISH 9MJAUU@@program.berlinbuzzwords.de

-9MJAUU

Relevance Feedback Inside the Search Engine en

20260608T104000 20260608T110000 002000

Relevance Feedback Inside the Search Engine

The relevance of search results is a use-case-dependent, capricious metric. Without access to the full dataset and visibility into the search algorithm, getting relevant results means either guessing the right query formulation or search engineers squeezing out the reranking (or context) budget to compensate for the search algorithm's required simplicity at scale. What if your retriever could be guided by relevance feedback signals from a smart model (like a reranker or even a search agent) during the search process itself, achieving higher recall and discoverability of relevant results at a reasonable cost? In this talk, I'll present our API for distilling relevance feedback from smart models directly into the vector search index. — This talk is sponsored by [Qdrant](https://qdrant.tech) PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/9MJAUU/ Palais Atelier Evgeniya Sukhodolskaya PUBLISH M7T8T3@@program.berlinbuzzwords.de

-M7T8T3

Mentoring In Open Source in the Age of AI en

20260608T111000 20260608T115000 004000

Mentoring In Open Source in the Age of AI

We've both been mentoring open source contributors through Outreachy for a few years. Tilda coordinates mentors globally, and we've both been mentors and interns. We thought we had this figured out. Then AI showed up and broke everything. Contributors started submitting perfect code they couldn't explain. PRs looked great, but ask someone to modify their own work and they'd freeze up. We realized people were using ChatGPT, and none of us—including the contributors themselves—could tell anymore what they'd actually learned versus what they'd generated. We had to completely rethink how we mentor. What we tried that didn't work: - Asking "Did you use AI?" got us nowhere. People felt defensive or genuinely didn't know if they'd learned something. - Treating AI code like copy-pasted Stack Overflow didn't work either; the volume and polish were totally different. - Trying to detect AI-generated code was pointless. We don't care if they used AI. We care if they learned. What actually worked: - We changed our code review questions from "Does this work?" to "Why this approach?" and "What happens if we change this?" The answers told us everything. - We restructured tasks: less "implement this feature" and more "solve this problem, explain your thinking, then build it." - We did more live pairing. It's hard to hide what you don't understand in real time. - We taught people to use AI for learning (asking it to explain concepts) instead of just generating solutions. This isn't solved—we're still figuring stuff out. But we've tried a lot and can tell you what worked and what didn't. You'll leave with concrete techniques you can use immediately if you're mentoring contributors, teaching programming, or helping anyone learn when AI's in the picture. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/M7T8T3/ Palais Atelier Tilda Udufo Busayo Ojo PUBLISH JYQZ8Y@@program.berlinbuzzwords.de

-JYQZ8Y

Beyond the Hype: When Apache Flink Solves Real Problems en

20260608T120000 20260608T124000 004000

Beyond the Hype: When Apache Flink Solves Real Problems

Apache Flink promises powerful stream processing, but when does that power translate to actual business value? This session provides the architectural clarity engineers need by focusing on specific use cases where Flink becomes essential versus scenarios where simpler alternatives suffice. Attendees will explore real-world problems that demand Flink's stateful processing and exactly-once guarantees—fraud detection, real-time recommendations, CDC-driven data lakes—contrasted with situations where batch jobs or Kafka Streams are better fits. The talk draws practical distinctions between stream processing engines (Flink versus Spark) and streaming platforms (Kafka with ksqlDB/Kstreams, Pulsar), clarifying when each architectural pattern shines. Engineers will leave equipped to confidently decide when streaming architecture delivers results and when it's unnecessary complexity. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/JYQZ8Y/ Palais Atelier Naci Simsek PUBLISH L98Q7L@@program.berlinbuzzwords.de

-L98Q7L

Apache Spark Declarative Pipelines in Action en

20260608T140000 20260608T144000 004000

Apache Spark Declarative Pipelines in Action

Spark Declarative Pipelines: Building Data Workflows with Spark 4.1's Game-Changing Feature Apache Spark 4.1 introduces Spark Declarative Pipelines (SDP), a paradigm shift that transforms how data engineers design and maintain complex data workflows. This hands-on session provides a comprehensive introduction to SDP, demonstrating how declarative configuration can replace traditional imperative Spark code for common data pipeline patterns. I will present a live example using an open-sourced PySpark data source I built with OpenSky founders from Oxford and ETH Zurich. In just a few lines of code, you'll create a continuous data pipeline with streaming tables ingesting real ADS-B flight data from aircraft overhead—from tiny Cessnas to massive Airbus A380s. No complex "glue code" for incremental ingestion—just define what your pipeline should do while Spark figures out how to do it. Using streaming tables and materialized views, we'll layer on AI-powered analytics, turning natural language questions like "Show me flights above 30,000 feet over California" into instant SQL queries against live crowdsourced IoT data. I'll demonstrate with a forever-free cloud environment where every attendee can replicate this example hands-on. Attendees will leave with practical knowledge to immediately begin experimenting with SDP and best practices for modernizing their pipeline development. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/L98Q7L/ Palais Atelier Frank Munz PUBLISH ZCF89D@@program.berlinbuzzwords.de

-ZCF89D

Why Choose One: Multi-Engine Analytics with Apache Wayang en

20260608T152000 20260608T160000 004000

Why Choose One: Multi-Engine Analytics with Apache Wayang

Modern analytics pipelines frequently span databases, big data engines, and machine learning frameworks. Connecting these systems manually leads to complex orchestration, high data movement cost, and platform-specific rewrites. This challenge also appears in agent-driven workflows where different steps of a task naturally map to different engines. Apache Wayang is a recently graduated Apache Top Level Project that provides a unified data analytics framework for cross-platform execution. Pipelines are expressed with platform-independent operators using Java, Scala, Python, or SQL APIs. A cross-platform optimizer then maps operators to execution backends such as Spark, Flink, JDBC databases, and ML systems, and produces execution plans that may span multiple engines. It models operator and data movement cost and supports runtime re-optimization when estimates are wrong. In practice, this lets developers write a pipeline once and run it efficiently across multiple engines without hard-wiring platform choices. The talk is technical and system-focused, aimed at practitioners working with heterogeneous data stacks. It has three parts: 1. Motivation (10-15 min): Why single-engine execution is often not enough. Concrete ETL, ML, and agent-based workflows that require multiple systems and create optimization and integration challenges. 2. System architecture and optimizer (20-25 min): Wayang’s platform-agnostic plans, operator mappings, cross-platform data movement handling, and stage-based execution model. How the cost-based optimizer inflates plans, evaluates alternatives, and selects mixed-engine execution strategies. Brief coverage of SQL, ML, and multi-language UDF support. 3. Project history, status, and next steps (5-10 min): From multi-year cross-platform analytics research to Apache and recent Top Level Project graduation. Extensibility for new platforms and current work on improved cost models and optimizer enhancements.
Attendees will gain a practical understanding of how cross-platform analytics can be executed efficiently and how to design pipelines that are not locked to a single processing engine. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/ZCF89D/ Palais Atelier Zoi Kaoudi Haralampos Gavriilidis PUBLISH U3NJ9P@@program.berlinbuzzwords.de

-U3NJ9P

Event-driven Agents with Complex Event Processing in Flink en

20260608T163000 20260608T171000 004000

Event-driven Agents with Complex Event Processing in Flink

Specialized, event-driven AI Agents, in contrast to planning agents, provide unique value for continuously monitoring real-time event pipelines, business processes or technical logs in Apache Kafka. Streaming Agents can invoke LLMs directly from Flink for each event, but this approach can be very costly for high-volume Kafka topics and can lead to non-deterministic outcomes. We showcase how Streaming Agents can be combined with Pattern Recognition and Anomaly Detection in Apache Flink to increase cost efficiency, avoid hallucinations and enforce predictable, deterministic behavior. High-volume event pipelines can be filtered very efficiently with Complex Event Processing (CEP), a core library in Apache Flink for recognizing patterns in sequences of events, and with traditional ML models that use statistical approaches to detect anomalies signalling critical errors and business opportunities. Streaming Agents can then invoke LLMs in a second step to classify or further analyze the detected patterns or anomalies, suggesting or triggering actions. Because these tasks are specialized, small models often perform well in this context and help achieve deterministic outcomes. Specifically in a business process context, this architecture provides opportunities for real-time process mining for ERP, manufacturing, supply chain and financial data to detect process issues and SLA violations earlier, reducing downtime and saving costs by taking action immediately. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/U3NJ9P/ Palais Atelier Steffen Hoellinger PUBLISH 3A9DSM@@program.berlinbuzzwords.de

-3A9DSM

Floe: Policy-Based Table Maintenance for Apache Iceberg en

20260608T172000 20260608T180000 004000

Floe: Policy-Based Table Maintenance for Apache Iceberg

Every Iceberg table needs maintenance, but catalogs don't execute and engines don't orchestrate. Teams end up with scripts that become DAGs that become technical debt. Nobody knows which tables are healthy, which are overdue, or what ran last. Floe is an open-source, policy-based maintenance system for Iceberg. Define rules with glob patterns, schedules, and health-driven triggers that gate operations based on real table metrics: small file percentage, snapshot count, delete file ratio, and partition skew. Priority resolves conflicts when patterns overlap. A maintenance debt score ranks tables by urgency so the most critical work runs first within your resource budget. Floe connects to REST, Polaris, Lakekeeper, Gravitino, DataHub, Hive Metastore, and Nessie catalogs, then delegates execution to Spark or Trino. A built-in dashboard shows table health trends, operation history, and policy coverage. This talk covers the policy model, health-driven maintenance planning, and a live demo. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/3A9DSM/ Palais Atelier Neelesh Salian PUBLISH 37ZLHV@@program.berlinbuzzwords.de

-37ZLHV

Towards Chunk-less RAG en

20260608T104000 20260608T110000 002000

Towards Chunk-less RAG

RAG systems have become foundational for grounding LLM outputs in factual knowledge, but they share a common limitation: semantic search operates at the chunk level, not the token level. This talk presents an experimental investigation into whether we can bypass chunking entirely by extracting token-level relevance directly from dense embedding models. The core insight is simple: by skipping the embedding pooling step and computing cosine similarity between every query token and every document token, we can generate relevance heatmaps that highlight exactly which spans matter for a given query and then extract those spans. The session will walk through the complete pipeline: * Extracting token-level embeddings from the last hidden layer of dense embedding models (specifically Qwen3-Embedding-0.6B) * Computing relevance matrices via normalized dot products between query and document token vectors * Collapsing multi-token query representations into per-document-token scores * Designing a clustering algorithm that identifies relevance peaks, groups nearby high-scoring tokens, and extends matches to semantic boundaries * Comparing results against purpose-built late-interaction models (ColBERT variants, Jina Embeddings v4) The experimental results reveal that the extracted spans show strong F1 scores when evaluated against ground truth answers in test documents. The comparison between models also shows that, despite being trained for pooled sentence embeddings, Qwen3's token-level representations outperform ColBERT-style models specifically designed for multi-vector matching. However, the approach surfaces two major challenges: storage requirements balloon by roughly 900× compared to traditional chunking, and the model's decoder-only architecture creates attention patterns that bias relevance toward document endings. This is explicitly experimental work shared in the spirit of exploring new directions, not presenting a production solution.
The goal is to spark discussion about whether the chunking paradigm is a necessary constraint or an artifact of current tooling, what modifications to model training or inference could make span-level retrieval practical at scale, and the parallelism between this approach and promising knowledge graph retrieval strategies. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/37ZLHV/ Frannz Salon Carles Onielfa PUBLISH Y7YVKP@@program.berlinbuzzwords.de
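
As a rough illustration of the relevance-matrix and score-collapsing steps listed in the "Towards Chunk-less RAG" abstract above, the sketch below uses tiny hand-written vectors in place of real token embeddings. The function names are hypothetical, and taking the maximum over query tokens is only one possible way to collapse the matrix; the abstract does not say which reduction the speaker uses.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def relevance_matrix(query_tokens, doc_tokens):
    """Cosine similarity between every query token and every document token.

    Rows correspond to query tokens, columns to document tokens.
    """
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    return [[sum(a * b for a, b in zip(qv, dv)) for dv in d] for qv in q]


def collapse_scores(matrix):
    """Reduce the matrix to one score per document token.

    Assumption: the best-matching query token decides the score (max).
    """
    return [max(column) for column in zip(*matrix)]


# Toy stand-ins for token-level embeddings (real ones would come from a model).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_tokens = [[2.0, 0.0], [1.0, 1.0], [-1.0, 0.0]]

scores = collapse_scores(relevance_matrix(query_tokens, doc_tokens))
print(scores)
```

A peak-finding or clustering step, as described in the abstract, would then operate on `scores` to pick out contiguous high-relevance spans.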

-Y7YVKP

No 0-day required, just target the AI coding assistant! en

20260608T111000 20260608T115000 004000

No 0-day required, just target the AI coding assistant!

Do you trust your AI coding assistant? What if I told you that attackers have found ways to manipulate it and attack your code? With everyone now using AI coding assistants it’s time to look at the risks! During this talk I’ll show you several new techniques attackers are already using. This will range from hidden messages (ASCII smuggling) to abusing mistyping and characters that look the same (typosquatting). I will also show how an LLM can make mistakes when generating code (hallucinations). Did you know that a smart attacker can abuse this too? When you join this talk, you’ll learn how to spot hidden text in your instruction file and prompts. I will also explain how to set up a trusted dependency repository to prevent the wrong code from entering your production environment! PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/Y7YVKP/ Frannz Salon Leo Visser PUBLISH AT33SV@@program.berlinbuzzwords.de

-AT33SV

OSS Security: Lessons from 10+ Years at Apache Solr en

20260608T120000 20260608T124000 004000

OSS Security: Lessons from 10+ Years at Apache Solr

The security landscape is ever-evolving; as threats emerge and best practices shift, open source projects must balance backwards-compatibility and their own volunteer-driven nature against the practical needs of modern security. For Apache Solr, a project that began without built-in authentication or authorization, this journey has been particularly instructive. This talk traces the evolution of security in Apache Solr from its early days through the present. We'll examine the major inflection points that shaped the project's security posture: the introduction of a pluggable authentication and authorization framework, the consideration of alternatives like Apache Shiro, formative CVE reports that exposed critical vulnerabilities, and significant deprecations like the Data Import Handler ("DIH") that sacrificed popular features for security. Along the way, we'll discuss the community processes and dynamics involved in each decision, along with the trade-offs of major choices (e.g. breaking changes vs. user safety). By the end of this talk, attendees will understand how security priorities have evolved in a major open source project and gain insights and examples (both good and bad!) to take back to their own applications and projects. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/AT33SV/ Frannz Salon Jason Gerlowski PUBLISH LQBMVZ@@program.berlinbuzzwords.de

-LQBMVZ

Livecoding Data Visualisations with Streamlit en

20260608T140000 20260608T144000 004000

Livecoding Data Visualisations with Streamlit

It's far too hard to visualise data. If you've got some data you want to share with people, it shouldn't need a React expert just to generate a chart. It shouldn't take a 3-tier architecture to give people an interactive view they can explore. But all too often it does. This is where Streamlit hits a design sweet spot. It's a simple framework that makes it incredibly easy to start with regular Python data-processing code, and get to a clean, professional visualisation, in minutes. Even if you're not a "frontend person", you can get a polished, interactive user interface in front of people with just a few extra lines of code. In this live-coding session you'll learn everything you need to get started with Streamlit. We'll start completely from scratch, explore the core parts of Streamlit's API, and see how any backend developer or data scientist can actually *show* their work faster than you can say, "JSON encoding". PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/LQBMVZ/ Frannz Salon Kris Jenkins PUBLISH ERWWUY@@program.berlinbuzzwords.de

-ERWWUY

Scientific Data Under Threat in Today’s America en

20260608T145000 20260608T151000 002000

Scientific Data Under Threat in Today’s America

What happens when the data used by climate research, public health, and civil rights enforcement becomes politically inconvenient? This talk examines the vulnerability of scientific data during the presidency of Donald Trump, a period marked by the removal of government web pages, restrictions on agency communications, and proposed budget cuts to research institutions. The talk highlights the role of [PANGAEA – Data Publisher for Earth & Environmental Science](https://www.pangaea.de/) in ensuring long-term preservation and open access to geoscientific datasets, demonstrating how trusted repositories can safeguard publicly funded research and make it globally accessible despite shifting political climates. Through persistent identifiers (DOIs), rich metadata, terminologies, standardized formats, and open licenses, data becomes FAIR — Findable, Accessible, Interoperable, and Reusable — and readily integrable into computational and AI workflows, enabling large-scale analysis, machine learning applications, and reproducible science across disciplines. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/ERWWUY/ Frannz Salon Uwe Schindler PUBLISH YWLB7A@@program.berlinbuzzwords.de

-YWLB7A

Let LLMs Wander: Engineering RL Environments en

20260608T152000 20260608T160000 004000

Let LLMs Wander: Engineering RL Environments

Since the release of reasoning Language Models like DeepSeek R1, improving model capabilities has been moving beyond static examples (Supervised Fine-Tuning) to interaction via Reinforcement Learning. To enable this, we need **RL Environments**: controlled worlds where models can act, get rewards, and learn. An environment is more than a dataset. It is a piece of software that orchestrates interactions with the model, manages state, defines rewards, and verifies outcomes. In this talk, I will walk you through my journey exploring this emerging space from a software engineering perspective. 1. I will start by mapping classic Reinforcement Learning concepts to Language Models. 2. I will then introduce **Verifiers**, an open-source library for building environments as software artifacts. 3. Based on Verifiers, we’ll see concrete **design patterns** that range from simple single-turn tasks, to multi-turn games, to environments for tool-using agents that interact with external systems. 4. I’ll share practical experiences using environments for **evaluating and training Small Language Models**. By the end of the session, attendees will be able to start building their own Reinforcement Learning environments, little worlds for LLMs. I'll also share the joys, frustrations, and lessons learned along the way. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/YWLB7A/ Frannz Salon Stefano Fiorucci PUBLISH BGDMFD@@program.berlinbuzzwords.de

-BGDMFD

SPRUCE it up! Open Source GreenOps at scale en

20260608T163000 20260608T171000 004000

SPRUCE it up! Open Source GreenOps at scale

The environmental impact of ICT—and cloud computing in particular—is rapidly increasing, driven largely by the rise of AI workloads. While **FinOps** has become a standard practice for managing cloud costs, its environmental counterpart, **GreenOps**, is still struggling to gain traction. A key obstacle is the lack of transparent, actionable sustainability data from cloud service providers. In this talk, we introduce [SPRUCE](https://opensourcegreenops.cloud/), a scalable open-source platform designed to implement GreenOps at scale. Built on Apache Spark and leveraging open data and models, SPRUCE processes large volumes of cloud usage data, enriching provider reports to quantify environmental impact and generate actionable insights through reports and visualisations. Attendees will gain a practical understanding of GreenOps, the current data and tooling landscape, and how a big-data–driven approach with Apache Spark can help teams measure and reduce both cloud carbon emissions and costs. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/BGDMFD/ Frannz Salon Julien Nioche PUBLISH QEXDKB@@program.berlinbuzzwords.de

-QEXDKB

Observability’s Sixth Sense: Detecting Anomalies in Metrics en

20260608T172000 20260608T180000 004000

Observability’s Sixth Sense: Detecting Anomalies in Metrics

Modern systems produce more metrics than any single person can reason about. As systems grow and change, defining fixed thresholds becomes harder and unexpected behavior often appears without clearly crossing an alert boundary. Using a short, live walkthrough with real metric data, the talk shows how anomalies can surface gradual changes, unusual patterns and subtle shifts that are easy to miss in dashboards. The session is exploratory and practical, aimed at developers who work with metrics and want additional ways to understand system behavior without introducing complex models or heavy tooling. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/QEXDKB/ Frannz Salon Diana Todea PUBLISH 7GUTXM@@program.berlinbuzzwords.de

-7GUTXM

Reviving phonetic algorithms for better search relevance en

20260609T093000 20260609T095000 002000

Reviving phonetic algorithms for better search relevance

When users are unsure of a spelling, fuzzy search is the standard engineering solution. However, at the scale of the French National Audio-visual Institute, we found that standard fuzziness hits a wall. On a massive corpus, "approximate" matching retrieves a paralyzing amount of noise, degrading the user experience. To solve this, we looked back to move forward. We revived and re-implemented "ancient" phonetic algorithms, some dating back decades, to test if matching by sound could outperform matching by character distance. In this talk, we share our journey in tuning relevance for the French language, which is notoriously difficult due to silent letters and homophones. We will cover: - The Fuzziness Trap: Why increasing edit distance failed to solve our precision/recall trade-off. - Algorithm Showdown: A comparative analysis of standard Fuzzy Querying vs. Phonetic Analysis (e.g., Soundex, Beider-Morse, Metaphone) within our search pipeline. - Implementation: How we integrated these phonetic tokens into our indexing strategy to filter noise without losing relevant results. You will leave with a clear understanding of when to abandon standard fuzziness and how to leverage phonetic search to clean up your own noisy results. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/7GUTXM/ Kesselhaus Pietro Mele Radu Pop PUBLISH 333TRT@@program.berlinbuzzwords.de
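
To make "matching by sound" from the phonetic-search abstract above concrete, here is a minimal sketch of classic American Soundex, one of the algorithms the abstract names. It is a generic textbook version, not the speakers' implementation, and Soundex was designed for English names, so tuning for French would need more than this.

```python
# Letter groups that share a Soundex digit; vowels, h, w and y carry no code.
CODES = {}
for letters, digit in (
    ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
    ("l", "4"), ("mn", "5"), ("r", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex key for a word ('' if it has no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            # h and w do not separate letters that share a code
            continue
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (first.upper() + "".join(digits) + "000")[:4]


# Two French surnames that sound alike map to the same key.
print(soundex("Dupont"), soundex("Dupond"))
```

Indexing such keys next to the original tokens lets a query match on the key instead of widening the edit distance, which is the trade-off the talk compares.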

-333TRT

From Inverted Index to Columnar Vectorized Execution Search en

20260609T100000 20260609T104000 004000

From Inverted Index to Columnar Vectorized Execution Search

Modern search workloads increasingly blend text retrieval with aggregations, vector search, and real-time analytics, pushing traditional inverted-index architectures beyond their original design. This session examines how techniques from columnar databases and high-performance analytics engines are being adopted to meet these demands. We explore three key shifts: how columnar storage improves cache locality for efficient aggregation and filtering; how SIMD and vectorized computation accelerate scoring, filtering, and similarity operations on modern CPUs; and how bulk ingestion and execution pipelines reduce coordination overhead while maximizing hardware utilization. Drawing from evolving open-source search ecosystems and real-world engineering efforts, we analyze where row-oriented execution falls short, discuss hybrid models combining inverted indexes with columnar processing, and explore treating search queries as vectorized data pipelines. Targeting developers and researchers interested in search internals, distributed systems performance, and the retrieval-analytics intersection, attendees will gain practical understanding of how hardware-aware design influences search architecture today, the trade-offs of integrating columnar and vectorized execution into retrieval systems, and where search infrastructure is heading next. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/333TRT/ Kesselhaus Ankit Jain PUBLISH M8DR9V@@program.berlinbuzzwords.de

-M8DR9V

When better retrieval makes agents worse en

20260609T111000 20260609T115000 004000

When better retrieval makes agents worse

In agentic workflows, retrieval is no longer just ranking for a human reader; it is context injection into reasoning and tool use. That shift changes the failure mode. Plausible but incorrect evidence can degrade outcomes disproportionately, and in noisy settings, longer reasoning can make answers worse rather than better. This is inverse scaling under noise: more capable reasoning produces more confident mistakes. In iterative agent loops, those mistakes are recycled and amplified, turning small retrieval defects into workflow-level failures. In this talk we'll break down the main failure modes, including plausible distractors, error compounding across steps, and the gap between traditional retrieval metrics and real task utility. We'll present design patterns for robust agentic retrieval: stricter evidence selection, sufficiency checks before acting, and explicit pause/retry/escalate behavior when confidence is not warranted. We'll also connect these patterns to challenges in open agent tooling ecosystems, where untrusted context has shown that retrieval is a threat surface as well as a ranking problem. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/M8DR9V/ Kesselhaus Lester Solbakken PUBLISH MHSTAZ@@program.berlinbuzzwords.de

-MHSTAZ

Keeping data private in real-time pipelines en

20260609T120000 20260609T124000 004000

Keeping data private in real-time pipelines

We all love real-time data — clicks, payments, rides, messages — but most of it comes with a catch: it contains personal information we’re not supposed to leak, such as names, emails, locations, or even small clues that can identify someone. The challenge: how do we keep streaming data useful and safe at the same time? In this talk, we’ll explore practical ways to protect privacy in streaming systems using Apache Kafka, Apache Flink, and Apache Iceberg. We’ll cover: - simple tricks like masking and tokenizing PII; - why “anonymous” data often isn’t anonymous (the re-identification problem); - techniques like bucketing, k-anonymity, and adding noise; - how to balance privacy with data utility (too much hiding makes data useless). Along the way, we’ll look at real-world stories: from public data leaks to surprising deanonymization attacks, and show live demos of pipelines that anonymize data before it’s written to storage. If you’ve ever wondered how to build privacy-aware pipelines, this talk will give you practical patterns you can use right away. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/MHSTAZ/ Kesselhaus Olena Kutsenko PUBLISH QJBKUU@@program.berlinbuzzwords.de
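
The masking, tokenizing, bucketing, and k-anonymity ideas listed in the privacy abstract above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the field names, the demo key, and the helper functions are hypothetical, and a real pipeline would apply such functions inside Kafka or Flink operators rather than on an in-memory list.

```python
import hashlib
import hmac
from collections import Counter

SECRET_KEY = b"demo-key"  # hypothetical; a real key would come from a secret store


def tokenize(value):
    """Replace a value with a stable keyed hash so records can still be joined."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def mask_email(email):
    """Keep only the first character of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def bucket_age(age, width=10):
    """Generalize an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


events = [
    {"email": "alice@example.com", "age": 31, "city": "Berlin"},
    {"email": "bob@example.com", "age": 34, "city": "Berlin"},
    {"email": "carol@example.com", "age": 38, "city": "Berlin"},
    {"email": "dave@example.com", "age": 52, "city": "Berlin"},
]

anonymized = [
    {
        "user": tokenize(e["email"]),
        "email": mask_email(e["email"]),
        "age": bucket_age(e["age"]),
        "city": e["city"],
    }
    for e in events
]

# The single record in the 50-59 bucket breaks 2-anonymity.
print(is_k_anonymous(anonymized, ["age", "city"], k=2))
```

The lone 50-59 record shows the re-identification risk the abstract mentions: even with the email tokenized, a unique combination of quasi-identifiers can still single a person out.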

-QJBKUU

The Three-Body Problem of Inverse Hybrid Search en

20260609T140000 20260609T144000 004000

The Three-Body Problem of Inverse Hybrid Search

Saved searches and alerts are common across e-commerce and marketplaces: price drops, availability notifications, and increasingly, visual alerts driven by images captured on mobile devices. While the user experience feels simple, the underlying system represents one of the most demanding forms of search. This talk reframes alerting as a distinct retrieval discipline: - **Inverse**: documents trigger queries, not the other way around - **Hybrid**: vector similarity, boolean filters, and lexical constraints must all apply - **Fetch-All**: every true match must be returned - no truncation, no approximation We examine why traditional search assumptions fail under these constraints. In particular, we show how cost and instability are driven not by throughput (QPS), but by match cardinality - the number of alerts matched per incoming item - and how this interacts with scatter/gather execution, merge costs, and bursty ingestion patterns. The talk focuses on: - where inverse hybrid systems break silently - why scaling infrastructure buys stability rather than throughput - how correctness becomes an operational and economic concern - why AI-driven recall often increases system pressure rather than reducing it Attendees will leave with a concrete framework for reasoning about inverse hybrid search systems at scale. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/QJBKUU/ Kesselhaus Ravindra Harige PUBLISH ZY3Y9U@@program.berlinbuzzwords.de

-ZY3Y9U

Beyond Grep: Search for Reliable Coding Agents en

20260609T145000 20260609T153000 004000

Beyond Grep: Search for Reliable Coding Agents

Coding agents work well partly because software is a verifiable domain: compilers, tests, and static checks create tight feedback loops that support iterative improvement. Yet even with better tooling, MCP integrations, and skills-based workflows, many agents still degrade in large codebases where retrieval quality becomes the limiting factor. This talk explores a working hypothesis: improving search is one of the highest-leverage ways to improve coding-agent outcomes before changing model size. We will examine retrieval patterns across keyword, structural, and hybrid lexical-semantic pipelines, and discuss where each approach may help or fail. Attendees will see how indexing, relevance tuning, and retrieval evaluation reduce token waste, improve answer quality, and provide stable foundations for agentic systems. A live demo shows search in action, highlighting how it complements AI rather than being replaced by it. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/ZY3Y9U/ Kesselhaus Amine GANI Roudy Khoury PUBLISH NBBST7@@program.berlinbuzzwords.de

-NBBST7

Correctness Too Cheap To Meter: Formal Verification and LLMs en

20260609T160000 20260609T164000 004000

Correctness Too Cheap To Meter: Formal Verification and LLMs

Formal methods can mathematically prove certain properties of software: for example, we can guarantee a database is deadlock free or avoids crashes. Major infrastructure providers like AWS and Azure all leverage verification, but it's currently too expensive and time-consuming to deploy for most use-cases. However, LLMs can automate much of this toil. This talk demonstrates how we can scale formal methods from an academic luxury to a tractable tool. We share novel research on applying LLMs to formalize real-world systems, including popular DBs and libraries. We present benchmark results, our automated formal spec generation framework, and current model shortcomings. In particular, we'll touch on: - What formal verification is, why it's key for critical systems, and how it's typically done - SysMoBench: an LLM benchmark grounded in practical formal verification metrics instead of toy tasks - Specula: an automated framework to synthesize formal specifications directly from source code, eliminating tedious dev work - New, unpublished research on connecting specs to real source code more efficiently Our approach decreases the implementation cost of formal methods, enabling industry to more efficiently avoid outages and bugs. Audience members will take away knowledge of what formal methods are and how to effectively deploy them by taking advantage of automation opportunities. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/NBBST7/ Kesselhaus Emilie Ma PUBLISH FXSUSJ@@program.berlinbuzzwords.de

-FXSUSJ

From Legacy Search to Vespa: What a Real PoC Taught Us en

20260609T165000 20260609T171000 002000

From Legacy Search to Vespa: What a Real PoC Taught Us

For a long time, our homepage recommendations were driven by a search-first relevance approach. It was fast to iterate on and easy to reason about, but it limited personalization and proved fragile as soon as listings lacked structure or consistency. In this talk, we describe how we transitioned to a Vespa-based recommendation stack, starting with the Motors category, where structured attributes are comparatively rich, and gradually expanding to less-structured categories. Rather than a big-bang rewrite, we incrementally replaced the legacy system. We’ll share what the PoC taught us in practice: how we ran old and new systems in parallel, defined guardrails for quality and stability, and progressively improved signals by introducing text embeddings for listings and searches, extracting attributes from free text, and incorporating signals derived from images. We’ll also cover what didn’t work as expected, which assumptions broke under real traffic, and how evaluation and rollout influenced the final architecture. Attendees will leave with concrete lessons on migrating relevance systems in production, running PoCs that expose real constraints, and introducing modern retrieval and ranking approaches when your data foundations are anything but perfect. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/FXSUSJ/ Kesselhaus André Charton Valeriia Platonova PUBLISH KW73Y8@@program.berlinbuzzwords.de

-KW73Y8

Closing Session en

20260609T171000 20260609T172000 001000

Closing Session

- PUBLIC CONFIRMED #BBuzz https://program.berlinbuzzwords.de/bbuzz26/talk/KW73Y8/ Kesselhaus Paul Berschick PUBLISH HWPQ7L@@program.berlinbuzzwords.de

-HWPQ7L

Circular Dependency Fixes when Bootstrapping a Golden Set en

20260609T093000 20260609T095000 002000

Circular Dependency Fixes when Bootstrapping a Golden Set

If you’re not satisfied with your golden set or don’t have one at all, this session is for you. You may have queries (e.g., from query logs) or you may need to generate them. We’ll start by looking at how to create synthetic queries from individual documents, as well as from facets and facet combinations that might match N documents. We’ll move on to relevance judgements. Even with LLM-as-a-judge, it’s not feasible to, say, rate a 1M doc corpus for 1K queries. We need the top N. How do we know the "correct" top N? We’ll need to explore the dataset for any query that is ambiguous (i.e., doesn't clearly match a single doc). There are different methods for exploring data: visualizations, analysis tweaks (e.g., stemming, synonyms)... Vector similarity also helps, but choosing an embedder is tricky because transfer learning can introduce bias that may be misleading for our dataset. We can’t get a perfect golden set on the first try, but we’ll explore techniques to iterate until we’re happy. This matters for any new search application, whether or not it’s central to the business (i.e., larger teams, bigger budget). PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/HWPQ7L/ Maschinenhaus Radu Gheorghe Rafał Kuć PUBLISH GWVGQP@@program.berlinbuzzwords.de

-GWVGQP

Text-to-Struct: Fine-tuning SLMs for Query Intent en

20260609T100000 20260609T104000 004000

Text-to-Struct: Fine-tuning SLMs for Query Intent

Building a search experience that feels "intelligent" requires more than just embedding user input or matching keywords. Real-world financial queries—whether from an analyst or a "lazy" Agentic LLM—are rarely optimized for your index. They are a messy mix of semantic intent ("tech stocks sensitive to rate hikes") and rigid constraints that simple hybrid search often ignores. We typically see three "Intent Killers" in production: * **Time:** "European bank guidance *last quarter*" (Vector search ignores recency; Keywords miss the fiscal calendar). * **Entities & Content Types:** "CEO remarks on AI in *10-K risk factors*" (Often conflated with general news or 10-Q tables). * **Ambiguity:** Generic LLMs often spam search APIs with broad, unrefined queries like "crypto regulation risks" that return noise instead of specific regulatory filings. In this session, we present a robust approach: **Fine-tuning a Small Language Model (SLM) to act as a dedicated "Query Understanding" layer.** We will move beyond simple RAG architectures and demonstrate how to train a small, deterministic model to parse raw text and output a valid **Structured Semantic Query**. The training dataset is prepared by combining real user queries with synthetic data. We used an LLM to assist in the initial annotation (a form of knowledge distillation), which was then meticulously reviewed to ensure the model captures the necessary constraints and financial nuance. This shifts the burden of "knowing how to search" from the user to the system. **We will cover:** * **The "Hybrid Gap":** Why combining Semantic + Lexical search is not enough. We will analyze failure cases involving strict fiscal periods, specific tickers (e.g., distinguishing "META" the company from "meta" the prefix), and document sub-types. * **The "LLM as User" Problem:** How to handle the influx of queries from generic LLM Agents.
We show how to translate their broad requests (e.g., "Give me macro trends") into the specific, optimized queries your engine actually needs. * **Why Not Just Prompt a Giant Model?** We demonstrate why "Prompt Engineering" generic LLMs is a dead end for high-performance finance search. We show how generalist models lack the necessary domain expertise to ensure schema adherence, and compare the latency/cost against specialized SLMs that offer 99% schema adherence. * **Query Expansion & Intent Routing** is a process where a fine-tuned Small Language Model (SLM) intercepts the user's initial search phrase and automatically enriches it with specific, structured search terms before sending it to the index. Instead of just matching keywords, the SLM *translates* the user's semantic intent into precise, optimized queries. For instance, a vague term like "greenwashing" is expanded and routed as multiple concepts, such as `regulatory_risk` OR `esg_controversy`. * **Impact on Relevance:** Real-world comparisons showing how "translating" intent upstream drastically improves retrieval quality for complex financial instruments compared to standard Hybrid Search. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/GWVGQP/ Maschinenhaus Hugo Jimenez Sandra Bullón PUBLISH WPX33K@@program.berlinbuzzwords.de
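The idea of a "Structured Semantic Query" with schema adherence can be made concrete with a small validation step on the model's output. The sketch below is illustrative only: the field names (`semantic_text`, `concepts`, `doc_type`, `fiscal_period`), the allowed vocabularies, and the helper functions are assumptions for this example, not the schema or code used in the talk.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical vocabularies for illustration; not the talk's real schema.
ALLOWED_DOC_TYPES = {"10-K", "10-Q", "news"}
ALLOWED_CONCEPTS = {"regulatory_risk", "esg_controversy", "rate_sensitivity"}
FIELDS = {"semantic_text", "concepts", "doc_type", "fiscal_period"}


@dataclass
class StructuredQuery:
    semantic_text: str                       # free text left for vector search
    concepts: List[str] = field(default_factory=list)
    doc_type: Optional[str] = None           # hard filter on document sub-type
    fiscal_period: Optional[str] = None      # hard filter on time


def parse_model_output(raw: str) -> StructuredQuery:
    """Parse the model's JSON output and reject anything outside the schema."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict) or "semantic_text" not in data:
        raise ValueError("output must be a JSON object with 'semantic_text'")
    unknown = set(data) - FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    query = StructuredQuery(**data)
    bad_concepts = set(query.concepts) - ALLOWED_CONCEPTS
    if bad_concepts:
        raise ValueError(f"unknown concepts: {sorted(bad_concepts)}")
    if query.doc_type is not None and query.doc_type not in ALLOWED_DOC_TYPES:
        raise ValueError(f"unknown doc_type: {query.doc_type}")
    return query


def concept_filter(query: StructuredQuery) -> str:
    """Render the expanded concepts as an OR expression for the index."""
    return " OR ".join(query.concepts)
```

A schema-adherence rate like the one mentioned in the abstract could then be estimated as the share of model outputs for which `parse_model_output` does not raise.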

-WPX33K

Context-Aware Segments: Solving the "Scatter-Read" Problem en

20260609T111000 20260609T115000 004000

Context-Aware Segments: Solving the "Scatter-Read" Problem

#### The Friction: The "Everything, Everywhere" Problem In distributed search engines like OpenSearch, the Shard is the unit of scale, but the Segment is the unit of storage. Traditionally, documents are written to segments based purely on arrival time. For multi-tenant SaaS platforms or high-velocity observability clusters, this means data for a specific tenant or time-range is scattered across every single segment within a shard. A simple filter query becomes an expensive fan-out operation, thrashing the file system cache and wasting CPU cycles checking documents that will never match. #### The Solution: Context-Aware Segments (CAS) This session dissects the design and implementation of CAS (OpenSearch RFC #18576) and its foundation in Lucene (Issue #13387). This architectural shift introduces a logical "context" dimension to segment creation. Instead of a temporal log, we treat segments as optimized containers for specific data subsets. In this technical deep-dive, we will cover: **Granular Segment Pruning**: How the query coordinator leverages new segment-level metadata to perform "pre-search" filtering—effectively skipping files on disk before the engine even opens them. **Vector Segment Pruning**: How we use segment-level metadata to skip entire HNSW graphs during a k-NN search. If a segment doesn't contain "Tenant A," we don't even load its vector blobs into memory. **Supercharged Compression**: We demonstrate how grouping similar data by context significantly increases compression ratios. When the storage engine sees repetitive data patterns in a single segment, the bit-packing and dictionary compression become far more efficient, slashing the data footprint. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/WPX33K/ Maschinenhaus Rishav Sagar Tejas Shah PUBLISH 9NH7VB@@program.berlinbuzzwords.de
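To make the pruning idea tangible, here is a toy model in plain Python. It is not the OpenSearch or Lucene API: the `Segment` class, its `tenants` metadata field, and the `search` function are hypothetical names chosen for this sketch. It only shows why grouping documents by context lets a filter query skip whole segments.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Segment:
    name: str
    tenants: FrozenSet[str]             # segment-level metadata (assumed)
    docs: Tuple[Tuple[str, str], ...]   # (tenant, text) pairs


def prune(segments: List[Segment], tenant: str) -> List[Segment]:
    """Keep only segments whose metadata says the tenant can be present."""
    return [s for s in segments if tenant in s.tenants]


def search(segments: List[Segment], tenant: str, term: str):
    """Return matching texts and the number of segments actually opened."""
    hits, opened = [], 0
    for seg in prune(segments, tenant):
        opened += 1
        hits.extend(text for t, text in seg.docs if t == tenant and term in text)
    return hits, opened
```

With arrival-ordered segments every segment tends to contain every tenant, so nothing is pruned; with one context per segment the same query opens a single segment.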

-9NH7VB

C++ Search for Database Kernels: Built In, Not Bolted On en

20260609T120000 20260609T124000 004000

C++ Search for Database Kernels: Built In, Not Bolted On

There's a certain irony in building a search engine no one can find. IResearch is an open-source Apache 2.0 C++ search library that has been quietly powering search inside databases since 2015: first behind ArangoSearch, now as the foundation of SereneDB. Instead of becoming yet another standalone search server, it evolved into a library designed to be embedded directly into database kernels. That journey defined most of the architectural decisions in the project. Some of them were good, some painful. This talk tells that story honestly. We'll start with how IResearch ended up inside databases and what that means in practice: WAL integration, transactional consistency of search indexes, synchronization with the main storage. From there, we'll compare IResearch functionally and architecturally to Lucene and Tantivy. All three Lucene-inspired engines diverge in different aspects: functionality, index layout and scoring. Using the search-benchmark-game suite, we'll put them head to head, not to declare a winner but to dissect why the numbers look the way they do and trace performance differences back to architectural roots. Then we will look at how different search engines handle scoring, and at how IResearch differs noticeably in that regard. When documents are scored one at a time, significant CPU throughput is left on the table. While recent Lucene versions have begun moving toward block-based evaluation, scoring remains far from fully vectorized. However, some newer approaches treat relevance computation as a SIMD-friendly evaluation pipeline similar to query execution engines. We'll walk through how this could work in practice and show the concrete throughput gains it delivers. Along the way, we'll be honest about the mistakes we made: architectural bets that didn't pay off, abstractions that hurt performance in production, integration patterns we had to rip out and rebuild. 
Finally, we'll zoom out to the architectural questions that emerge when search lives inside a database. How do you scale a search index when compute and storage are separated? How does search fit into OLAP query execution: late materialization, joins between search indexes and analytical data, unified query planning? We'll share what we've learned solving these in practice. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/9NH7VB/ Maschinenhaus Andrey Abramov PUBLISH GPKCWA@@program.berlinbuzzwords.de

-GPKCWA

One GPU, Four Retrieval Modes: Multi-Model Search Serving en

20260609T140000 20260609T144000 004000

One GPU, Four Retrieval Modes: Multi-Model Search Serving

Every production search system in 2026 runs multiple models. A dense embedder handles semantic search. A sparse model provides keyword recall. A multi-vector model like ColBERT enables token-level matching. A cross-encoder reranker improves final precision. These four stages have become table stakes for competitive retrieval quality. The infrastructure story is less elegant. The industry default is one container per model, typically using HuggingFace TEI, Triton, or a custom Flask wrapper. Four models means four separate deployments, four sets of scaling rules, and four GPU allocations where each model uses a fraction of what it reserves. When building SIE, an open-source search inference engine, we took a different approach: one server process that handles all four retrieval modes through a unified API with three primitives (encode, score, extract). Models like BGE-M3 return dense, sparse, and multi-vector outputs from a single encode call. Cross-encoder reranking uses the score primitive. Same server, same GPU, same API. The talk covers four areas. First, why hybrid retrieval requires multiple model types. We will walk through a real retrieval pipeline: sparse for keyword recall, dense for semantic matching, ColBERT for token-level precision, and a cross-encoder for final reranking. For each stage we will show what it adds to retrieval quality using BEIR benchmark data, and when the added complexity is not worth it. Second, the adapter architecture that makes multi-model serving possible. SIE wraps PyTorch, FlashAttention, SentenceTransformers, and SGLang behind a common interface. We will walk through the lifecycle of a request: API call, tokenization on CPU, batching, GPU inference, and postprocessing. Different model architectures need different compute backends, and we will explain why a single unified runtime was not the right choice. Third, building the pipeline end to end. 
A practical walkthrough of dense + sparse + ColBERT + reranking from a single server instance, including how to combine scores from different retrieval modes and how to tune the balance between recall and precision. Fourth, tradeoffs and lessons. When does multi-model serving on one GPU work well, and when should a model get its own dedicated container? What happens under concurrent load when multiple models compete for memory? We will share real data from running these workloads on L4 GPUs. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/GPKCWA/ Maschinenhaus Filip Makraduli PUBLISH TUSDT8@@program.berlinbuzzwords.de
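One common way to combine results from several retrieval modes is reciprocal rank fusion (RRF), which merges ranked lists without needing comparable raw scores. The sketch below is a generic illustration of RRF and is not taken from SIE; the abstract does not say which fusion method the talk uses, and the function name and the constant `k=60` (a conventional default) are assumptions here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document ids; each list is ordered best first.

    A document's fused score is the sum of 1 / (k + rank) over every list
    in which it appears, with ranks starting at 1.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

For example, the fused candidate list from dense, sparse, and multi-vector retrieval could be passed to a cross-encoder for final reranking.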

-TUSDT8

Zero downtime index upgrade in Apache Solr en

20260609T145000 20260609T153000 004000

Zero downtime index upgrade in Apache Solr

Starting with Lucene/Solr 7.x, an index created in a given major version is only usable until the next major version upgrade. Beyond that you are required to recreate the index from source data. This can be a practical challenge for large clusters, since it can imply potential downtime and/or significant infrastructure and operational costs, or worse, a dead end if the true source of data no longer exists. Starting with Apache Solr 9.11 [yet to be released as of this draft], users have the ability to upgrade an index in-place with zero downtime, subject to certain constraints. This prepares the index for a future Solr major version upgrade, eliminating the need to recreate the index from source. The implication is that an index originally created in Solr 8.x now has a pathway to future upgrades without needing the source data. In the talk, I’ll discuss the mechanisms and APIs that Solr exposes to support this capability, and the constraints involved. The audience will also learn about implementation details across the Lucene and Solr layers, and how the underlying changes made during this effort pave the way for a similar capability in other Lucene-based search engines. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/TUSDT8/ Maschinenhaus Rahul Goswami PUBLISH EDXRLN@@program.berlinbuzzwords.de

-EDXRLN

Building Schema-Free Applications with RDF en

20260609T160000 20260609T164000 004000

Building Schema-Free Applications with RDF

Most applications assume their data model is known before the first user interacts with the system. But there are cases where this assumption doesn't hold, and the structure of the data needs to emerge from how people use the system rather than being designed upfront. This talk explores why common database paradigms fall short for this use case and how that search led us to Resource Description Framework (RDF). Originally designed for the semantic web, RDF stores knowledge as subject-predicate-object triples, a surprisingly natural fit for application data when the schema isn't fixed. We cover the practical side: using fine-tuned open source models to translate natural language into SPARQL queries, drawing on research like FIRESPARQL, storing data with tools like Oxigraph, and self-hosting models with our open source model serving platform Paddler (https://github.com/intentee/paddler). Finally, we show how LLMs can derive not just what users explicitly say but also implicit relationships, opening new possibilities for analytics and knowledge discovery. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/EDXRLN/ Maschinenhaus Gosia Zagajewska Mateusz Charytoniuk PUBLISH 7ND3UE@@program.berlinbuzzwords.de
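The subject-predicate-object model is easy to picture with a tiny in-memory stand-in. The sketch below uses plain Python tuples rather than Oxigraph or SPARQL, and the example data and the `match` helper are hypothetical. It shows how a new kind of fact can be added without any schema migration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


store: List[Triple] = [
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
]

# A predicate nobody planned for: no table change, just another triple.
store.append(("alice", "mentors", "bob"))
```

A pattern such as `match(store, p="worksAt", o="acme")` plays the role that a basic graph pattern plays in a SPARQL query.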

-7ND3UE

How to Survive the Vortex of LLM Change en

20260609T165000 20260609T171000 002000

How to Survive the Vortex of LLM Change

Working with LLMs today means operating in an environment where models, APIs, capabilities, and costs change constantly. What works today may become obsolete in months, creating technical and organizational pressure on teams. In this talk, we share our experience working in an intelligent search company in this environment. We will share the good, the bad, and the ugly: the rollercoaster of realizing that something which took hours of code can suddenly be achieved with a simple prompt. We discuss how we evaluate new models without destabilizing production, stay updated without losing our minds, and separate the wheat from the chaff in the constant stream of LLM news. Beyond technical architecture, we reflect on the human side of constant change. The goal is not to predict where LLMs will go next, but to share strategies for building systems and teams that adapt without losing sanity. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/7ND3UE/ Maschinenhaus Carmen Iniesta Carles Onielfa PUBLISH AJK8EK@@program.berlinbuzzwords.de

-AJK8EK

Kafi Streams: Complex Stream Processing Made Simple en

20260609T100000 20260609T104000 004000

Kafi Streams: Complex Stream Processing Made Simple

I will unveil Kafi Streams, an Open Source library for complex stream processing inspired by Kafka Streams but built on top of PyDBSP, a pure Python implementation of Feldera's novel "Database Stream Processing" theory. Why would we need yet another stream processing library? One whose name sounds so strikingly similar to the most popular stream processing library on the planet? Because existing stream processing libraries are too complex, even Kafka Streams. Their engines have been prematurely optimized for maximum scale, not simplicity. You cannot easily do stream processing without understanding concepts like streams vs. tables, co-partitioning, windowing (hopping, tumbling, sessions...), state stores, etc.: all the "leaky abstractions" that still prevail in the stream processing world. These are what keep stream processing in a niche. In contrast, Kafi Streams aims to make stream processing simple. It does not (yet) aim for extreme scale and performance, but rather to enable complex stream processing with full support for joins, aggregations et al. for the less performance-heavy 80% of use cases in non-tech companies like mine, Migros, a $30B+ revenue retailer. With Kafi Streams, anyone can do complex stream processing, even those who have never done it before. Right from the start. Because with DBSP as our basis, streaming is no different from batch any longer. Simple. Deterministic. Not just eventually but strongly consistent. Just like anybody coming from outside the streaming world would always have hoped. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/AJK8EK/ Palais Atelier Ralph Matthias Debusmann PUBLISH 7EVE78@@program.berlinbuzzwords.de

-7EVE78

DuckDB beyond the notebook en

20260609T111000 20260609T115000 004000

DuckDB beyond the notebook

SQLite as an embedded database is known by everyone — easy to integrate into any application, it's the most widely used database in the world. DuckDB is also an embedded database, but with a focus on analytical queries rather than transactional workloads. Today, DuckDB has evolved into a blazingly fast query engine that runs almost everywhere. This opens up new architectural possibilities for building data-driven applications — especially when analytics need to be delivered directly to end users through interactive reports, dashboards, or exploratory tools. In this talk, I'll use minimal slides and plenty of live demos to show how developers can build fast and lean data applications with DuckDB. We'll explore scenarios including browser-based analytics powered by WebAssembly, serverless functions processing data from cloud storage, and embedded analytics in traditional applications. We'll also examine the architectural implications: how embedded OLAP changes data architectures by bringing compute closer to the data, enabling 1.5-tier and cache-layer patterns that eliminate the need for separate analytics infrastructure. Attendees will learn what DuckDB is and how it works, how it differs from other embedded databases, and how to use it to build data-driven applications that go well beyond the notebook. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/7EVE78/ Palais Atelier Matthias Niehoff PUBLISH MYQSXK@@program.berlinbuzzwords.de

-MYQSXK

OTel + Apache Iceberg: The New Standard for Observability en

20260609T120000 20260609T124000 004000

OTel + Apache Iceberg: The New Standard for Observability

Observability is shifting from vendor-specific stacks to an open, composable architecture. This talk presents a reference design where OpenTelemetry provides collection, context propagation, and semantic normalization, and Apache Iceberg becomes the open data layer for logs, metrics, and traces. We will explain why this pairing is emerging as a practical standard for portability and governance, and why it fits agent-driven investigation workflows. The focus is on production write-path realities: schema drift, high cardinality, small-file control, commit and compaction strategy, and streaming aggregation patterns that keep latency and cost predictable. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/MYQSXK/ Palais Atelier Yingjun Wu PUBLISH A3JKMH@@program.berlinbuzzwords.de

-A3JKMH

What If We've Been Scaling Stream Processing Wrong All Along en

20260609T140000 20260609T144000 004000

What If We've Been Scaling Stream Processing Wrong All Along

Your Kafka Streams application just rebalanced. Again. Your Flink checkpoint is timing out. Again. Here's an uncomfortable truth: most stream processing applications don't operate at Uber scale. They handle thousands of events per second (complex joins, stateful aggregations, valid use cases) but nowhere near the volumes that justify the operational complexity we've accepted as normal. Yet we pay the full distributed systems tax anyway. Repartition topics doubling network I/O and storage. Repeated serialization burning CPU cycles, often accounting for a significant share of an application's total compute. Standby replicas sitting idle. State migration or restoration during deployments. And the human cost: specialized expertise that takes years to develop, expert teams that are expensive to build and painful to lose. We've normalized extraordinary inefficiency in the name of horizontal scalability that many applications will never need. But rethinking stream processing in 2026 doesn't mean "just use Postgres." In this talk, I'll share an early-stage exploration of a different approach. A framework that preserves the Kafka Streams DSL, borrows Flink's approach to exactly-once semantics, leverages Project Loom for high concurrency, and challenges a fundamental assumption that both frameworks share. This is an invitation to question conventional wisdom and explore what stream processing could look like when we stop distributing by default. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/A3JKMH/ Palais Atelier Hartmut Armbruster PUBLISH GH8HEH@@program.berlinbuzzwords.de

-GH8HEH

Detecting Hidden Bias in Datasets Before Models Fail en

20260609T145000 20260609T153000 004000

Detecting Hidden Bias in Datasets Before Models Fail

Machine learning models rarely fail because of algorithms — they fail because of data. This talk focuses on practical techniques for detecting hidden bias in datasets before models reach production. Drawing from real-world ML systems, it covers how regional, temporal, and behavioral imbalances distort model behavior while remaining invisible to standard metrics. Attendees will learn how to identify distribution drift, uncover feature leakage, and detect coverage gaps across segments and time windows. The session demonstrates concrete workflows, diagnostics, and visualizations that can be applied using open-source tools to improve data quality, model reliability, and long-term trust in ML-driven products. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/GH8HEH/ Palais Atelier Stas Don PUBLISH UW9W9C@@program.berlinbuzzwords.de

-UW9W9C

What you should know about constraints in PostgreSQL 18 en

20260609T160000 20260609T164000 004000

What you should know about constraints in PostgreSQL 18

PostgreSQL 18 introduces significant enhancements to constraints, the first line of defense for maintaining data integrity. This talk focuses on new capabilities in version 18, including non-overlapping PRIMARY KEY, UNIQUE, and foreign key constraints; NOT NULL constraints becoming first-class citizens; the introduction of NOT ENFORCED constraints and improved support for partitioned tables. We’ll look at what’s new, why it matters and how to apply these features in real-world systems. We’ll begin with a detailed walkthrough of the pg_constraint catalog, covering less commonly discussed concepts such as constraint deferrability, constraint triggers, domains and related internals. From there, we’ll move on to what’s new in PostgreSQL 18. A major addition is temporal keys, bringing PostgreSQL a step closer to supporting temporal data models. Another key change is NOT NULL becoming a standard constraint along with the implications of that promotion. We’ll also explore NOT ENFORCED constraints and other recent additions and briefly look ahead to what’s coming in PostgreSQL 19. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/UW9W9C/ Palais Atelier Gülçin Yıldırım Jelinek PUBLISH DLYAP8@@program.berlinbuzzwords.de

-DLYAP8

AI in the physical world: from observation to discovery en

20260609T165000 20260609T171000 002000

AI in the physical world: from observation to discovery

Artificial intelligence is moving beyond text generation and digital optimization into domains where uncertainty, scale, and scientific rigor dominate. Modern physics provides a uniquely demanding testbed for this shift: deep learning is used to reconstruct complex events, identify rare phenomena, and search for anomalies in high-dimensional datasets characterized by sparse signals and strict statistical constraints. At the same time, AI is taking on more structured roles in scientific workflows — from code generation and literature synthesis to emerging agent-based approaches — raising fundamental questions about how far AI can support scientific reasoning in practice. A central challenge is operational integration. AI methods are increasingly explored for decision support in complex research facilities: tuning accelerator parameters, assisting telescope operations, and adapting to evolving environmental and hardware conditions. Yet claims of autonomous discovery or fully AI-driven infrastructure have often proven difficult to reproduce outside controlled settings. A balanced, engineering-focused assessment of both successes and limitations is therefore essential. In this talk, I will survey real-world applications of AI across modern physics, from collider experiments to large-scale astronomy systems. I will highlight measurable gains alongside negative results, sources of bias, and stability issues that matter for production environments. The presentation concludes with a concrete case study from gamma-ray astrophysics, illustrating both the opportunities and the practical limits of integrating AI into data analysis pipelines and next-generation observatory infrastructure. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/DLYAP8/ Palais Atelier Dmitriy Kostunin Julian von Hoerschelmann-Schliwinski PUBLISH ZWUDWR@@program.berlinbuzzwords.de

-ZWUDWR

Writes, 3 ways: Postgres, Apache Kafka® and Apache Iceberg™ en

20260609T093000 20260609T095000 002000

Writes, 3 ways: Postgres, Apache Kafka® and Apache Iceberg™

The world of data services is evolving rapidly, with adoption of open table formats like Apache Iceberg™ picking up steam quickly. But “data services” is a pretty broad category, and none of these services is quite like the other. In this talk we’ll take a step back to look at three data services: Postgres, Apache Kafka and Apache Iceberg, and how they each handle writes. In doing so, we’ll trace a history through how data services have evolved in the world of distributed systems and big data. We’ll understand the key differences and similarities between these services. Finally, we’ll take a look at what’s coming next in the world of open source data, from Postgres and beyond. This session is meant as a refresher for existing data engineers as well as a primer for junior engineers: Most developers know a bit about Postgres but they might not fully understand the internals, and many engineers are getting heavily involved in Iceberg, but might not understand why it's relevant. PUBLIC CONFIRMED Short Talk https://program.berlinbuzzwords.de/bbuzz26/talk/ZWUDWR/ Frannz Salon Celeste Horgan PUBLISH XQHAM9@@program.berlinbuzzwords.de

-XQHAM9

GitOps for n8n: Treating Workflows as Code en

20260609T100000 20260609T104000 004000

GitOps for n8n: Treating Workflows as Code

Automation workflows frequently run critical business logic, yet they are often excluded from the same operational discipline applied to application code. In many teams, n8n workflows are created visually, copied between environments, and modified directly in production, making changes hard to audit, review, or roll back. This talk presents n8n-gitops, an open-source project that explores how GitOps principles can be applied to n8n without changing how workflows are authored. The session starts by framing the problem: why manual promotion of workflows, UI-driven deployments, and inline code create operational risk as systems grow. From there, we introduce the core ideas behind n8n-gitops: treating Git as the single source of truth, exporting workflows in mirror mode, externalizing code for proper review, and deploying deterministically from Git references. A live demonstration will show: - Exporting workflows from n8n into a Git repository in mirror mode - Externalizing Python and JavaScript code into first-class files - Reviewing workflow changes using normal Git diffs and pull requests - Deploying workflows from a specific Git tag or commit - Rolling back safely by redeploying a previous Git reference The second half of the talk focuses on lessons learned: - What GitOps brings to automation platforms - Why credentials are intentionally excluded from full automation - Trade-offs compared to UI-driven or enterprise Git integrations - When this approach improves safety, and when it adds unnecessary friction. This is not a product presentation, but an experience report on extending GitOps beyond infrastructure into workflow engines, aimed at engineers working with operations, automation, and platform tooling who want stronger guarantees around change, traceability, and deployment. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/XQHAM9/ Frannz Salon Joao Gilberto Magalhaes PUBLISH CJRUW3@@program.berlinbuzzwords.de

-CJRUW3

Real-Time ML Pipelines: Feature Chaining with Chronon en

20260609T111000 20260609T115000 004000

Real-Time ML Pipelines: Feature Chaining with Chronon

Traditional feature engineering pipelines force teams to choose between freshness and latency, leading to complex dual architectures that are expensive to maintain and prone to training-serving skew. For search and recommendation systems, this trade-off is particularly painful: you need a blend of fresh signals (user, query, and item features) and their corresponding embeddings for retrieval and ranking, but can't sacrifice the sub-100ms latencies these systems need to meet. This talk explores how [Chronon](https://chronon.ai) solves this challenge through a unified abstraction over batch and streaming computation, allowing teams to define features once and serve them with minimal latency while keeping them updated in near real-time. Chronon has been battle-tested in production at companies like Stripe, Airbnb, Netflix, and OpenAI, serving billions of predictions daily. We'll use a two-tower search retrieval and ranking pipeline as our primary case study, walking through: * Computing real-time user and item embeddings for candidate retrieval * Chaining embedding computation with tabular features to power ranking models * Minimizing computation in the serving hot path, reducing infrastructure costs by orders of magnitude Audience takeaways: * How Chronon unifies batch and streaming feature computation * Chronon's pluggable architecture with respect to table formats, streaming buses, KV stores and model platforms * Chronon's approach to minimizing serving latency while maximizing feature freshness in production ML systems * How one can build ML pipelines that chain feature computation with model inference / embedding * Real-world lessons from companies serving billions of predictions daily This talk sits at the intersection of search, data streaming, and AI in production, ideal for ML engineers, search platform teams, and anyone building real-time intelligent applications at scale. 
PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/CJRUW3/ Frannz Salon Varant Zanoyan PUBLISH HUWSBR@@program.berlinbuzzwords.de

-HUWSBR

The Failures That Don't Crash: MLOps for AI Agents en

20260609T120000 20260609T124000 004000

The Failures That Don't Crash: MLOps for AI Agents

AI agents are shipping to production without the reliability patterns we spent decades building for distributed systems. Only 37% of teams run online evaluations on their agents (LangChain State of Agent Engineering 2026). The rest have no systematic way to detect when an agent produces a confident, plausible, wrong answer. This talk bridges that gap. Drawing from 15 years of building systems at scale (50 billion requests/month at Start.io, shadow deployment pipelines at Riskmethods, and the core MLOps platform at Qwak) I'll present four reliability patterns adapted for agent architectures: 1. Shadow testing agents against a baseline before promoting them to production 2. Circuit breakers with confidence thresholds instead of simple error rates 3. Evaluation harnesses designed for non-deterministic outputs 4. Structured human oversight that accounts for automation bias decay Each pattern comes with implementation details: what to measure, where to hook into the agent lifecycle, and what failure modes to watch for. The examples are framework-agnostic and based on real production systems, not toy demos. The audience will walk away with concrete patterns they can apply to their own agent deployments whether they're building with LangChain, LlamaIndex, custom frameworks, or bare API calls. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/HUWSBR/ Frannz Salon Bartosz Mikulski PUBLISH H9UL7Y@@program.berlinbuzzwords.de
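The second pattern, a circuit breaker keyed on confidence rather than error rate, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the speaker's implementation: the class name, the sliding-window policy, and the default thresholds are all hypothetical, and how a confidence score is obtained for an agent answer is left out of scope.

```python
from collections import deque


class ConfidenceCircuitBreaker:
    """Trip when too many recent answers fall below a confidence threshold."""

    def __init__(self, threshold: float = 0.7, window: int = 5, max_low: int = 3):
        self.threshold = threshold          # answers below this count as "low"
        self.max_low = max_low              # low answers tolerated per window
        self.recent = deque(maxlen=window)  # sliding window of low/ok flags
        self.open = False                   # open means traffic is diverted

    def record(self, confidence: float) -> bool:
        """Record one answer's confidence; return True if the breaker is open."""
        self.recent.append(confidence < self.threshold)
        if sum(self.recent) >= self.max_low:
            self.open = True
        return self.open

    def allow(self) -> bool:
        """Whether the agent may keep answering without a fallback."""
        return not self.open

    def reset(self) -> None:
        """Close the breaker again, for example after human review."""
        self.recent.clear()
        self.open = False
```

Unlike an error-rate breaker, this one can trip while every call still returns successfully, which is the failure mode the talk title refers to.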

-H9UL7Y

How to Tell If Your Agent Used the Right Stuff en

20260609T140000 20260609T144000 004000

How to Tell If Your Agent Used the Right Stuff

Your agent answered confidently, but did it use the right evidence? We’ll walk through a repeatable debugging workflow for RAG + tool-using agents: instrument traces, inspect retrieved chunks, run attribution and citation checks, and isolate failure modes (missing recall, bad ranking, distractors, stale docs). You’ll learn how to create a lightweight golden set, write probe questions, and track retrieval + answer metrics so improvements are measurable, not vibes. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/H9UL7Y/ Frannz Salon Apurva Misra PUBLISH G3RFDB@@program.berlinbuzzwords.de
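As one possible shape for tracking a retrieval metric over a golden set, the sketch below computes recall@k per question and averages it. The function names, the golden-set layout (question paired with the ids of chunks known to be relevant), and the `retrieve` callable are assumptions made for illustration; the talk may use different metrics or tooling.

```python
from typing import Callable, Iterable, List, Set, Tuple


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant chunk ids that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_at_k(golden_set: Iterable[Tuple[str, Set[str]]],
                     retrieve: Callable[[str], List[str]],
                     k: int = 3) -> float:
    """Average recall@k over (question, relevant ids) pairs."""
    scores = [recall_at_k(retrieve(q), rel, k) for q, rel in golden_set]
    return sum(scores) / len(scores) if scores else 0.0
```

Running the same golden set before and after a retriever change gives a number to compare, which is the point of making improvements measurable.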

-G3RFDB

Sunset for the Wild West: Making ML disciplined by default en

20260609T145000 20260609T153000 004000

Sunset for the Wild West: Making ML disciplined by default

At first glance, MLOps teams have an unenviable challenge, since they exist to bridge the gap between machine learning practitioners and infrastructure engineers, who work at opposite ends of the application stack and have distinct vocabularies, skills, and goals. Practitioners often adopt an anything-goes creative approach and figure out _why_ a technique works after it's already getting results; this culture has led to many advances in applied machine learning but can be in tension with building reliable systems. However, there's a surprising commonality between ML practitioners and infrastructure teams, and their concerns may not be as different as they appear. Infrastructure engineers care about security, observability, and predictable utilization while ML practitioners care about reproducibility, understandability, and performance. This session will argue that the diverse concerns of these groups are often manifestations of the same underlying systems challenges, and that the same open-source tools can help both audiences address their pain points. We'll draw on our experience helping researchers get experiments into production at scale and helping infrastructure teams deploy and manage enormous clusters. Most importantly, we did this while meeting practitioners where they are: without requiring researchers to become release engineers or demanding that SRE teams start caring about gradients or manifolds. 
You'll come away from this talk with concrete tools and playbooks to make machine learning systems safer and more predictable, to eliminate the error-prone manual work of getting code from an experimental environment ready for collaboration or production, to help researchers achieve reproducible results, to better understand the software your team wants to run and the infrastructure that supports it, to balance overhead and observability for demanding workloads, and to ensure that you know at a glance what's actually running on your compute clusters — from project-specific Kubernetes configurations all the way down to device drivers and everything in between. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/G3RFDB/ Frannz Salon William Benton PUBLISH R37LPK@@program.berlinbuzzwords.de

-R37LPK

Escaping the Cloud: High-Performance AI in your Browser en

20260609T160000 20260609T164000 004000

Escaping the Cloud: High-Performance AI in your Browser

Server-side inference is the bottleneck of modern AI. It introduces network latency, creates massive operational costs, and forces complex privacy compliance. But what if we could push the compute entirely to the edge, specifically to the browser tab? This session explores the architecture of **Client-Side AI**, where the strategy is to distribute the workload to the user's own hardware. We will investigate the modern browser-based ML stack:

- The Runtime: How **ONNX Runtime** provides a near-native execution environment for models trained in PyTorch or TensorFlow.
- The Hardware Access: Leveraging **WebGPU** to unlock direct access to the client’s GPU, bypassing the limitations of legacy WebGL.
- The Pipeline: A technical look at optimizing transformer models (quantization, caching) for delivery over the wire using libraries like **Transformers.js**.

But most of all, we will look at actual demos of LLMs, speech and computer vision models all running in the browser. We’ll be honest about the trade-offs: memory limits, model size constraints, and the reality of browser compatibility in 2026. Join us to see if the future of AI scaling is actually... no servers at all. PUBLIC CONFIRMED Talk https://program.berlinbuzzwords.de/bbuzz26/talk/R37LPK/ Frannz Salon Johannes Kolbe