Berlin Buzzwords 2023

Following his passion, he entered the Apache Lucene and Solr world in 2010, becoming an active member of the community and an Apache Lucene/Solr committer and PMC member.
Experience with a great variety of clients has taught him to be a proficient and professional consultant.
Recently Alessandro has contributed Neural Search to Apache Solr and worked on integrating Apache Solr’s Learning To Rank in various company ecosystems with the aim of improving search result relevancy.
Prior to that he designed and developed an enterprise semantic search engine known as Sensefy using approaches such as Named Entity Recognition at indexing time, advanced autocompletion, and document similarity metrics.

Introducing Multi-valued Vector Fields in Apache Lucene

Aline Paponaud

CTO of Adelean, working with search and providing consulting services and expertise around Elasticsearch, Lucene and Solr. She brings her energy to leveraging search engines, as they become more and more essential in every domain.

Towards a decentralized and collaborative search engine

Anna Ruggero

Anna has demonstrated a passion for Information Retrieval since university. Having graduated from the University of Padua with a computer science master’s degree and a dissertation in Entity Search, Anna has been working as a Search Consultant at Sease since 2019.
She actively works to support clients in the process of improving their search engines with the implementation of innovative personalized solutions.
She specializes in the integration of machine learning techniques with information retrieval systems, from Learning to Rank techniques to Neural Searches and Recommender Systems. She extensively worked on e-commerce websites, improving their performance by developing personalized models and evaluation systems.
Anna strongly believes in innovation and research, keeping up-to-date with the latest academic studies and contributing to them. She participated in the European Conference on Information Retrieval 2022 with a poster on offline and online evaluation in the industry, and published a paper on improving interleaving techniques for the evaluation of information retrieval systems at ECIR 2023.

How to Implement Online Search Quality Evaluation with Kibana

Anshum Gupta

Anshum is an Apache Lucene and Solr committer and Project Management Committee member. He started dabbling with Lucene over 15 years ago, and since then has worked at various organizations building both internal and consumer-facing search platforms on top of Lucene and Solr. He is currently a part of the ACS Open Source Solr team helping groups across Apple with their search infrastructure.

Cross Data Center Replication in Solr - A new approach

Atita Arora

Atita has been working to develop, customize, and optimize Enterprise & E-commerce search engines for many years. She is an active contributor to many open-source tools. She holds two Master's degrees, in Computer Applications and Strategic Business Management. She has worked and provided support in many different roles across various organizations and even founded a small search consultancy in India in 2017.
She has a keen interest in personalizing search and influencing customer interaction using NLP, ML, and AI.

Vectorize Your Open Source Search Engine

Aydan Rende

Aydan Rende is a Senior Data Engineer within the platform team at Kleinanzeigen. In this role, she develops data pipelines and assists teams in facilitating their data requirements. Aydan began her professional journey at Kleinanzeigen as a Software Engineer, working with commercial products. Alongside her professional career, she is a Formula 1 fan who has an on & off relationship with Ferrari.

Migrate Data, <Mesh> in mind

Benjamin Dauvissat

Java developer for almost 20 years, I also look after various fields like DevOps or big data.
I work at Adelean, a French company specialized in search engines.
And when I'm not working, I like coding and testing new stuff.

Big data in the service of reliable news

Bertram Sändig

Bertram Sändig leads the Machine Learning team at ontolux, a brand of Neofonie GmbH. He works on the adaptation, optimization, and integration of large language models for ontolux's text analysis toolkit, translating current research results into usable applications for customers.

ML with Domain-Specific Ontology for IT Security Industry

Bhavani Ravi

Bhavani Ravi is an independent DataOps consultant who helps you set up scalable data infrastructures. She is also an avid technical blogger, open-source enthusiast, and LinkedIn Learning Instructor.

Apache Airflow in Production - Bad vs Best Practices

Bo Wang

Bo Wang is a senior Machine Learning engineer who's leading the development of Finetuner. He got his BSc from Lanzhou University, China, and his MSc from TU Delft, the Netherlands, with a background in multimedia information retrieval. He is the core developer of the first-wave semantic search framework MatchZoo, and also the developer of Jina Core & DocArray.

Model Fine-tuning For Search: From Algorithms to Infra

Byron Voorbach

Byron Voorbach has spent over a decade in the search domain, providing consultation to companies and aiding in implementing large-scale search systems. As the current Head of Sales Engineering at Weaviate, he collaborates with customers globally to harness the power of semantic search in their operations.
A regular conference speaker and active contributor to open-source projects, Byron enjoys tackling complex problems and venturing into diverse domains. His work also includes building projects demonstrating cutting-edge search technologies’ potential and functionality and committing new functionality to Weaviate.

From keyword to vector

Celeste Horgan

Building On-Ramps for Non-Code Contributors in Open Source

Charlie Hull

With over 20 years in the business of open source search, Charlie Hull helps companies across the world build powerful and accurate search engines as a Managing Consultant at OpenSource Connections, the search relevance people. He is co-author of the book 'Searching the Enterprise', a regular conference keynote speaker, prolific blogger and writer, and hosts and organises the Haystack conference series.

The Debate Returns (with more vectors) Which Search Engine?

Chris Hutchinson

Chris Hutchinson is Chief Customer Officer at TravelTime, a UK-based company that builds high performance mobility APIs that enable users to search location data using time instead of distance. Chris is responsible for ensuring that users get maximum value from the API at all stages of the customer journey, from testing to integration to production use.

Search saves lives: solving healthcare problems with search

Danica Fine

Danica Fine is a Senior Developer Advocate at Confluent where she helps others get the most out of their event-driven pipelines. In her previous role as a software engineer on a streaming infrastructure team, she predominantly worked on Kafka Streams- and Kafka Connect-based projects. She can be found on Twitter, tweeting about tech, plants, and baking @TheDanicaFine.

A Kafka Client’s Request: There and Back Again

Dennis Berger

Dennis Berger is a freelance software and infrastructure engineer. He started his career in the industry developing low-latency applications and infrastructures with deep knowledge of the operating system, kernel, and application code. Now he focuses on developing fast and resource-efficient applications across the stack, from the IO path in the kernel to the user space, using innovative and modern technologies.

Searching large data sets in (near) constant time

Fokko Driesprong

Fokko is an open-source enthusiast and member of the Apache Software Foundation. Committer on Apache {Avro, Parquet, Druid, Airflow, Iceberg} and currently working as an open-source developer for Tabular where he focuses on PyIceberg; a non-JVM implementation of Iceberg. In his free time, he spends most of his time with friends and family.

Tip of the Iceberg

Gal Bashan

Gal is the Director of Engineering at Epsagon, recently acquired by Cisco, working in the observability space with a focus on distributed tracing. Gal has a cyber-security background and experience in reverse engineering and network analysis. Gal was part of an elite army intelligence unit before joining Epsagon.

Platform Engineering is All About Product

Houston Putman

Houston is a Lucene/Solr PMC member and committer. He works at Apple on the Open Source Technologies team, developing Solr and creating a better ecosystem for it in the cloud. Previously Houston worked at Bloomberg, as a member of the Search Infrastructure team. He has degrees in Computer Science & Mathematics from The University of Texas at Austin.

Rethinking Autoscaling for Apache Solr using Kubernetes

Ilaria Petreti

After an initial experience in the healthcare sector, believing strongly in the power of Big Data and Digital Transformation, Ilaria earned a Master's in Data Science.
Since joining the Sease team (in 2020), she has gained a diverse range of experiences through projects related to Machine Learning and Natural Language Processing for Information Retrieval systems.
Ilaria has been working on integrating Learning To Rank and Search Quality Evaluation in e-commerce ecosystems, with the goal of improving their performance and the relevance of search results.
Additionally, she is an active member of the information retrieval research community, regularly sharing her knowledge through blogs and talks, contributing to open-source projects, and participating in international conferences.

How to Implement Online Search Quality Evaluation with Kibana

Jason Gerlowski

A software engineer with 10+ years working on search, located on the East Coast in the U.S. I'm a longtime committer and PMC member on the Apache Lucene and Solr projects. Outside of tech, I enjoy reading and spending time outdoors with my family.

A Fresh Start? The Path Toward Apache Solr's v2 API

Javier Ramirez

As a Developer Advocate at QuestDB, I help developers make the most of their (fast) data, I make sure the core team behind QuestDB listens to absolutely every piece of feedback I get, and I facilitate collaboration in our open source repository.

I love data storage, big and small. I have extensive experience with SQL, NoSQL, graph, in-memory databases, Big Data, and Machine Learning. I like distributed, scalable, always-on systems.

Ingesting over 4 million rows a second on a single instance

Jennifer Ding

Jennifer Ding is a Research Application Manager at The Alan Turing Institute, the UK’s national institute for data science and artificial intelligence. Previously, she was a startup founder and data scientist at several public interest tech companies, creating data products for industry and government partners. She enjoys massaging data big and small, and is co-leading the first ever London Data Week, which takes place 3-9 July 2023.

What defines the “open” in “open AI”?

Jo Kristian Bergum

Jo Kristian is a Distinguished Engineer @Yahoo, where he spends his time working on the open-source Vespa.ai serving engine. Jo Kristian has 20 years of experience with deploying search systems at scale.

Boosting Ranking Performance with Minimal Supervision

Julien Jakubowski

Julien Jakubowski is a Developer Advocate at StreamNative with over 20 years of experience as a developer, staff engineer, and consultant. He has built several complex systems with distributed, scalable, and event-driven architecture for various industrial sectors such as retail, finance, and manufacturing.

Julien delivers talks at conferences on software engineering: Devoxx, Java User Groups, and Google Developer Groups, among others.

Julien is also one of the founders and leaders of the Ch'ti JUG - Java User Group of Lille, France.

Scalable distributed messaging&streaming with Apache Pulsar

Kacper Łukawski

Kacper Łukawski is a Developer Advocate at Qdrant - an open-source neural search engine. Recently he’s been exploring the world of similarity learning and vector search.

ChatGPT is lying, how can we fix it?

Khosrow Ebrahimpour

I'm a Production Engineering manager at Shopify, where I lead the search platform team. Prior to that, I've worked at public companies and government organizations focusing on infrastructure handling large scale data.

Highly Available Search at Shopify

Lara Perinetti

Machine Learning Engineer focused on NLP and IR @Qwant.

Privacy-Preserving Web Search

Lars Albertsson

Lars Albertsson is the founder of Scling, a data engineering startup based in Stockholm. Scling provides data-factory-as-a-service: customer-tailored data engineering, analytics, and data science. Lars is a frequent conference speaker on data engineering and data strategy. Before founding Scling, Lars worked at Google, Spotify, Schibsted, and as an independent consultant, helping organisations create business value from data processing and machine learning.

How to not kill people

Lucian Precup

Lucian Precup is the CTO of all.site - the collaborative search engine developed at Station F in Paris. With his colleagues at Adelean, Lucian develops solutions for indexing, searching and analyzing data. Lucian regularly shares his knowledge in specialized conferences and organizes the Search & Data Meetup.

Towards a decentralized and collaborative search engine

Maciej Obuchowski

Maciej is a software engineer at GetInData and an OpenLineage committer. He loves contributing to open source projects and playing with cats.

Column-level lineage is coming to the rescue

Maish Saidel-Keesing

A public speaker, a creator of things, a writer of books, a contributor to community, and yes, also an ambulance driver. Senior Developer Advocate @AWS.
Maish Saidel-Keesing is a Senior Enterprise Developer Advocate @AWS working on containers. He has been working in IT for the past 20 years, with a stronger focus on cloud and automation for the past 7.

He has extensive experience with AWS Cloud technologies, DevOps and Agile practices and implementations, containers, Kubernetes, virtualization, and modern applications.

He is constantly trying to bridge the gap between Developers and Operators to allow all of us to provide a better service for our customers (and not wake up from pages in the middle of the night). He is an avid practitioner of dissolving silos - educating Ops how to code and explaining to Devs what the hell Operations is.

Automation is the way things should be done - and he is constantly looking for ways to make life easier wherever he can.

Creating chaos in containers

Malte Pietsch

Malte is Co-Founder & CTO at deepset, where he builds Haystack - an open source framework that lets you quickly build production-ready NLP services for semantic search, question answering & more. He holds an M.Sc. with honors from TU Munich and conducted research at Carnegie Mellon University. Before founding deepset he worked as a data scientist for multiple startups. He is an open-source lover, likes reading papers before breakfast, and is obsessed with automating the boring parts of our work.

Connect GPT with your data: Retrieval-augmented Generation

Marija Selakovic

Marija Selakovic is a developer advocate at Crate.io, working with the CrateDB database and various other data engineering tools. She holds a Ph.D. degree in computer science from TU Darmstadt and a Master's degree in software engineering from VU University Amsterdam. As a developer advocate, Marija builds various technical content, speaks at developer conferences, and helps other software developers be productive and successful in using CrateDB.

When ms matter: Maximizing query performance in CrateDB

Mark Miller

Cross Data Center Replication in Solr - A new approach

Martin Bayton

Martin is currently Director of International Marketing at Pureinsights and is an evangelist for search and data analytics. He joined Pureinsights from Accenture where he worked for the Search & Content Analytics Group in Applied Intelligence. Prior to Accenture, he was Senior Manager Global OEM Partner Marketing at Qlik. Martin also spent close to 10 years working for the enterprise search vendor Convera.
Martin holds an MBA from Nottingham Business School and a BSc in Mechanical Engineering from Nottingham Trent University.

Using Dense Vector search at the EU Publications Office

Matt Williams

Matt Williams is a Principal Engineer at Cookpad, the world's largest recipe sharing platform, where he specialises in search and discovery. He has over six years of experience in building discovery experiences, with a particular interest in NLP, ML, scalable search and recommendation, and the team structures and processes that underpin effective relevance improvement.

Prior to joining Cookpad in 2019, Matt worked as a Data Scientist and ML Engineer in industry, with applications to social media analysis and real-time news analysis. Before entering the software industry, Matt was a research scientist in academia, focusing on network science and predictive user modelling. He holds a PhD in Computer Science from Cardiff University, where he was also a lecturer.

Cooking up a new search system: Recipe search at Cookpad

Maximilian Werk

I enjoy bringing machine learning into production at Jina.ai as Head of Engineering. The combination of high quality engineering, digging into data and the real-world problem at hand thrills me.

Model Fine-tuning For Search: From Algorithms to Infra

Milind Shyani

Milind Shyani is an applied scientist at Amazon Web Services working on language models and machine learning algorithms. He is a theoretical physicist by training and received his Ph.D. from Stanford University.

Supercharging your transformers with synthetic query generation and lexical search

Natali Vlatko

Natali Vlatko (she/her) is the SIG Docs Co-Chair for the Kubernetes project and plays on the fun computer in her spare time. Her academic background is in Egyptology and Archaeology; specifically, burial customs across the various kingdoms of Ancient Egypt. Ask her about dead stuff.

Building On-Ramps for Non-Code Contributors in Open Source

Nick Burch

Nick is heavily involved in a number of Apache projects, such as Tika and POI, while having the fortune to know many of the people involved in the Apache Big Data and Search space! When not helping out with Apache things, Nick works as the Director of Engineering at FLEC, where he leads a team making heavy use of Open Source technologies. When not helping improve the logistics industry, he is often to be found attending or organising BarCamps, Geek Nights, or other such fun events dedicated to sharing what's great and new!

Barcamp
Laptop-sized ML for Text, with Open Source

Olena Kutsenko

Olena is a seasoned expert in data, sustainable software development, and teamwork. With a background in software engineering, she's led teams and developed mission-critical applications at Nokia, HERE Technologies, and AWS. Currently, she works at Aiven where she supports developers and customers in using open-source data technologies such as Apache Kafka, ClickHouse, and OpenSearch. She is also an international public speaker and regularly presents at conferences around the world. She holds AWS Developer and Solutions Architect certifications, and is also a Confluent Catalyst.

ClickHouse: what is behind the fastest columnar database

Paweł Leszczyński

Pawel (@pawel-big-lebowski on GitHub) is an OpenLineage contributor. As a data practitioner with a decade of experience, he focuses on converting data processing logs and metrics into meaningful observability insights.

Column-level lineage is coming to the rescue

Philipp Krenn

Philipp lives to demo interesting technology. Having worked as a web, infrastructure, and database engineer for over ten years, Philipp is now a developer advocate and EMEA team lead at Elastic — the company behind the Elastic Stack consisting of Elasticsearch, Kibana, Beats, and Logstash. Based in Vienna, Austria, he is constantly traveling Europe and beyond to speak and discuss open source software, search, databases, infrastructure, and security.

Catch the fraud — with observability and analytics

Qi Wu

Qi Wu, Machine Learning Engineer at ontolux, a brand of Neofonie GmbH, works on topics such as training and optimizing models, with a focus on finetuning and distillation, and translates current research results into usable applications for customers. During her master's studies in statistics, she worked with Prof. Dr. Alan Akbik on the NLP framework FLAIR and on ML in the area of natural language processing.

ML with Domain-Specific Ontology for IT Security Industry

Quentin Herreros

Throughout my career, I have worked on diverse subjects such as medical resonance imaging, infra-red sensor characterization, and predicting carbon footprint in buildings using machine learning. I have been working on natural language processing for three years and I joined Elastic nine months ago.

How to train your general purpose document retriever model

Radu Gheorghe

Radu Gheorghe works mainly as a search consultant at Sematext, working with clients of all sizes on their Elasticsearch, OpenSearch and Solr projects. He is also a trainer and does production support for these search engines.

Sometimes he helps out with the development of Sematext Cloud (an observability SaaS), mostly when it comes to Elasticsearch and log shippers (e.g. Logstash, rsyslog...). He also writes on the Sematext blog or helps others publish new articles.

He co-authored a book (Elasticsearch in Action, Manning), recorded a video tutorial (Working with Elasticsearch, O'Reilly) and was a speaker at a number of conferences, such as Berlin Buzzwords, LuceneSolrRevolution (later Activate) and Kubecon.

Using TensorFlow in a Solr Query Parser

Radu Pop

Radu provides Consulting Services as Solutions Architect at Adelean. He handles projects around Elasticsearch and Adelean’s A2 search technology. He oversees the integration and evolution of search engines within large e-commerce platforms, marketplaces or organizations' data lakes. Prior to joining Adelean, Radu acquired a solid experience in Web archiving, operating large scale crawling systems in the context of several European research projects. He holds a PhD in Computer Science and a MSc in Distributed Systems.

Big data in the service of reliable news

Rafał Kuć

Software engineer, trainer, consultant and author from time to time - some would say that he is an all in one battle weapon concentrated mostly on Lucene, Solr and Elasticsearch. Currently an Engineering Lead at Archipelo. However, he also likes all the other cool stuff that is happening in the IT world. Likes to share his knowledge by giving talks at various meetups and conferences.

Using TensorFlow in a Solr Query Parser

Ram Mohan Rao Chukka

Ram, Software Developer @JFrog. Previously worked for startup companies like CallidusCloud (SAP Company), Konylabs. Loves automation, Linux, and open source.

Who broke the build? -Using Kuttl to test and Release faster

Robert Metzger

Robert Metzger is a committer and PMC member at Apache Flink and a Staff Engineer at decodable. He previously co-founded and successfully exited data Artisans (now Ververica), the company originally creating and commercializing Flink. He is a frequent speaker at conferences such as QCon and ApacheCon, and at meetups around the world.

Tiny Flink — Minimizing the memory footprint of Apache Flink

Roman Grebennikov

Principal Engineer at Delivery Hero SE, working on search personalization and recommendations. A pragmatic fan of functional programming, learn-to-rank models and performance engineering.

Learning to hybrid search

Ryan Ginstrom

I am a machine learning engineer at Mercari. I live and work in Japan. My professional interest these days is using machine learning in production at scale, and the special challenges this poses.

Building MLOps Infrastructure at Japan's Largest C2C E-Commerce Site

Savannah Norem

Currently a Developer Advocate at Redis, Savannah has a love for talking about all that technology can (and can't) do for people. When she's not live stream coding, or working on examples to help others get answers faster, she's either crafting, gardening, or hanging out with her husband and their cats.

When Probably is Good Enough

Shikhar Srivastava

Shikhar is a Software Engineer on the News Search Infrastructure Engineering team at Bloomberg in London. He has worn multiple hats in his professional career, from developing ETA prediction machine learning models for startups in India to developing low latency, financial market data systems at Bloomberg. He recently started dabbling with Apache Solr and has fallen in love with it.

No Mean Feat: Upgrading a Customized Solr to Upstream Solr

Sophie Watson

Sophie is a data scientist at Nvidia where she focuses on tools and techniques for accelerating data science and machine learning workflows and workloads. She has previously worked to help customers build machine learning systems in the hybrid cloud. She’s a frequent public speaker on topics including machine learning workflows on Kubernetes, recommendation engines, and MLOps. Sophie earned her PhD in Bayesian statistics.

Avoiding Anti-patterns in Technical Communication

Stanimira Vlaeva

Stanimira Vlaeva is a Developer Advocate at MongoDB and a Google Developer Expert for Angular. She is passionate about explaining complex technical topics in an understandable way, live-coding, and contributing to open-source software. Her Twitter DMs are always open!

Advanced Search Plays with GraphQL

Stefan Sprenger

Stefan is co-founder and CEO at DataCater GmbH, the company behind the real-time ETL platform based on Apache Kafka. He has more than 10 years of experience in software and data engineering and researched database systems on modern hardware during his PhD studies.

A Crash Course in Error Handling for Streaming Data Pipeline

Stefano Fiorucci

Always passionate about computer science 💻, Stefano approached Machine Learning after receiving an education in engineering.

His interest in Machine Learning comes from how the field sits at the intersection of scientific research and software craftsmanship. Over time, Stefano gained a deep understanding of Natural Language Processing and Information Retrieval.

Lately, he has been fascinated by the vibrant field of neural/semantic/vector search 🔎, and enjoys contributing to open source projects in this field.

Fact Checking Rocks: how to build a fact-checking system

Steve Loughran

Steve Loughran is a developer at Cloudera where he focuses on Hadoop and Cloud Integration. Prior to joining Cloudera he was a research scientist at HP Laboratories, where he was involved in the early Ubiquitous Computing/Wearable Computing work. This is why the failure of the smart home is such a disappointment. For fun he falls off bicycles, which is why he spent December 2021 shouting at lightbulbs while waiting for his broken collarbone to heal.

Hadoop Vectored IO: your data just got faster!
Alexa, is The Smart Home vision failing?

Stéphane Campinas

Stéphane completed his Ph.D. studies at the University of Galway (Ireland) working on an Information Retrieval engine for Linked Data and became deeply interested in that field. He then transitioned to working for Siren, a spin-off of that research endeavor. From that point on, he worked on Federate, an Elasticsearch plugin for computing joins between inverted indices, and was responsible for maintaining and developing various parts, from the query planner to interactions with Lucene such as the query cache.

Deep dive into an Elasticsearch plugin for query-time joins

Suman Karumuri

Suman Karumuri is a Principal Software Engineer and the tech lead for Observability at Airbnb. As an expert in distributed tracing, Suman has been a tech lead of Zipkin and a co-author of the OpenTracing standard, a Linux Foundation project under the CNCF. With extensive experience, Suman has spent years building and operating petabyte-scale log search, distributed tracing, and metrics systems at notable companies like Slack, Pinterest, Twitter, and Amazon. In his leisure time, Suman enjoys engaging in board games, exploring the outdoors through hiking, and spending quality time with his children.

Kaldb: serverless lucene at petabyte scale

Teo Narboneta Zosa

Teo is a machine learning engineer in the AI & Search division of Mercari, Japan’s largest C2C marketplace. He is currently working across various business-critical projects and helping establish foundational MLOps processes and best practices across the org.

Building MLOps Infrastructure at Japan's Largest C2C E-Commerce Site

Tom Veasey

Tom Veasey has worked at Elastic since September 2016. He is a member of the machine learning team. He started out as a data scientist working on satellite control, phased array radar and drug discovery projects. He then had detours into Electronic Design Automation and FX derivatives pricing. He studied Physics at the University of Cambridge.

How to train your general purpose document retriever model

Tomáš Neubauer

Tomas Neubauer is a co-founder and the CTO at Quix, where he works as a technical authority for the engineering team and is responsible for the direction of the company across the full technical stack. He was previously technical lead at McLaren, where he led architecture uplift for Formula 1 racing real-time telemetry acquisition. He later led platform development outside motorsport, reusing the know-how he gained from racing.

Building Real-Time Applications: Cyclist Crash Detection

Torsten Bøgh Köster

Torsten is a freelance search & operations engineer with a focus on open-source search, container, and cloud technology. He tweaks Apache Solr installations in the cloud and on bare-metal with a focus on observability.

Searching large data sets in (near) constant time

Tudor Golubenco

Tudor is CTO at Xata, a modern serverless database that provides extra data functionality like AI, search, or image transformations. Previously, he worked at data companies like Elastic and Oracle.

Semantic vs keyword search as context for GPT

Uwe Schindler

Uwe is a committer and PMC member of Apache Lucene and Apache Solr. His main focus is on development of Lucene Core. He implemented fast numerical search and is maintaining the new attribute-based text analysis API. He studied Physics at the University of Erlangen-Nuremberg and works as managing director for SD DataSolutions GmbH in Bremen, Germany, a company that provides consulting and support for Apache Lucene, Elasticsearch, and Apache Solr. He also works for “PANGAEA – Publishing Network for Geoscientific & Environmental Data” where he implemented the portal's geo-spatial retrieval functions with Lucene Java. Uwe has given talks about Lucene at various international conferences like the previous Berlin Buzzwords, ApacheCon EU/US, Lucene Revolution, Lucene Eurocon, and various local meetups.

What's coming next with Apache Lucene?

Vsevolod Goloviznin

Software engineer in the past, switched tracks to work closer with customers and product. Has multi-year experience of communicating with customers to understand what they really want and translating this information to engineers as a Head of Product.

Learning to hybrid search

William Benton

William Benton is passionate about making it easier for machine learning practitioners to benefit from advanced infrastructure and making it possible for organizations to manage machine learning systems. His recent roles have included defining product strategy and professional services offerings related to data science and machine learning, leading teams of data scientists and engineers, and contributing to many open source communities related to data, ML, and distributed systems. Will was an early advocate of building machine learning systems on Kubernetes and developed and popularized the “intelligent applications” idiom for machine learning systems in the cloud. He has also conducted research and development related to static program analysis, language runtimes, cluster configuration management, and music technology.

Synthetic data: when, why, and how

Yingjun Wu

Yingjun Wu is the founder of RisingWave Labs, the company developing RisingWave, a distributed SQL database for stream processing. Before running the company, Yingjun was a software engineer on the Redshift team at Amazon Web Services, and a researcher in the Database group at IBM Almaden Research Center. Yingjun received his PhD degree from the National University of Singapore, and was a visiting PhD student at Carnegie Mellon University. He has been working in the field of stream processing and database systems for over a decade.

Joining Dozens of Data Streams in Distributed Stream Processing Systems

Zhibo Li

Zhibo Li is a final-year PhD student of Informatics at the Informatics School, University of Edinburgh. His research interests include System & Architecture, especially in Data-Centric Parallelism, Compiler, and Programming Model. He is currently working on a Property-based Collection Skeletons library.

Declarative Data Collections for Portable Parallelism