Berlin Buzzwords 2025

Cross Domain Enterprise Search - Content Diversity at Scale
2025-06-17, Maschinenhaus

This talk focuses on lessons learned while building an enterprise search platform for multimodal content, ranging from highly domain-specific documents to images and unstructured text. It discusses problems of extraction, inference, and relevance, and showcases cross-domain search at scale.


Cross-domain search is a long-standing problem. The challenges begin with ingesting a variety of data that differs in structure, content, noise, and extraction strategy, and continue with generating multiple ground-truth golden data sets to benchmark the relevance of each individual corpus. On top of that comes cross-domain relevance across multimodal content, where there is no defined mechanism for normalising scores across individual queries and different content. Add domain-specific terminology, cross-modal embedding generators, and language-specific issues, and the list goes on.

This talk focuses on lessons from building an enterprise search system that handles more than 10 different types of content at the same time and scales to billions of documents. Attendees can expect to learn novel techniques for cross-domain ranking, content curation, content extraction, and natural language query processing.


Tags:

Search, Scale, Stories

Level:

Intermediate

Distributed systems and information retrieval guy

I turn complex engineering challenges into scalable, high-impact solutions while building rockstar teams along the way. With deep expertise in search, recommender systems, and distributed systems, I thrive at the intersection of machine learning, engineering, and business growth.
If you like to geek out over AI, search, or the magic of data-driven decisions, let’s connect!