2025-06-16, Kesselhaus
Apache Solr 9.8 introduces the LLM module, opening the door to end-to-end natural language query support through vector-backed semantic search (K-Nearest Neighbors).
This talk explores the open-source contribution from both the indexing and query angles, and what’s coming next for Solr in terms of integrations with Large Language Models.
Dense vector search was introduced in Apache Solr 9.0 in 2022 and has since seen substantial adoption by the community.
Until now, text vectorisation had to happen outside Solr, as the search engine had no transparent way to encode text into vectors.
Apache Solr 9.8 changes this, introducing a module that allows interaction with well-known large language model providers such as OpenAI, Cohere, HuggingFace and Mistral AI via the open-source library LangChain4j.
Expect to learn how to configure Solr to access external text vectorisation services, use them to encode and run your queries through the 'knn_text_to_vector' query parser, and vectorise your documents’ textual fields through the 'Text To Vector Update Request Processor'.
This is a foundational enabler that speeds up the design and development of end-to-end semantic search solutions.
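As a rough illustration of the query side described above, the sketch below builds a model definition and a 'knn_text_to_vector' query string in Python. It is a minimal sketch under stated assumptions, not the talk's own material: the JSON layout, the LangChain4j class name and the local-parameter syntax reflect the Solr Reference Guide for the LLM module as commonly documented, and should be checked against the Solr version in use. The model name `a-model`, the field `vector`, the collection `products` and the API key are hypothetical placeholders.

```python
import json
from urllib.parse import urlencode


def model_definition(name, api_key):
    """Build a model definition intended for Solr's text-to-vector model store.

    Assumption: the layout (class / name / params) follows the Solr
    Reference Guide; the parameter names mirror LangChain4j's OpenAI
    embedding model builder. Verify both against your Solr version.
    """
    return {
        "class": "dev.langchain4j.model.openai.OpenAiEmbeddingModel",
        "name": name,
        "params": {
            "baseUrl": "https://api.openai.com/v1",
            "apiKey": api_key,  # placeholder, never hard-code real keys
            "modelName": "text-embedding-3-small",
        },
    }


def knn_text_to_vector_query(model, field, top_k, text):
    """Build the q parameter: local params select the parser, the model,
    the dense vector field and the number of neighbours to retrieve."""
    return f"{{!knn_text_to_vector model={model} f={field} topK={top_k}}}{text}"


def select_url(base_url, collection, q):
    """Build a URL-encoded /select request URL for the given query string."""
    return f"{base_url}/solr/{collection}/select?" + urlencode({"q": q})


if __name__ == "__main__":
    # Hypothetical names: 'a-model', 'vector' and 'products' are examples only.
    print(json.dumps(model_definition("a-model", "YOUR_API_KEY"), indent=2))
    q = knn_text_to_vector_query("a-model", "vector", 10, "hello world query")
    print(q)
    print(select_url("http://localhost:8983", "products", q))
```

In this sketch the model definition would be uploaded once to the collection's model store (assumed here to be the `schema/text-to-vector-model-store` endpoint), after which queries only need to reference the model by name; Solr then calls the external service to encode the query text before running the KNN search.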
The talk wraps up with future directions and how the introduction of the LLM module opens the door to exciting new integrations.
Join us as we dive into the AI future of Apache Solr!
Search
Level: Intermediate
Alessandro Benedetti is an Apache Lucene/Solr committer, Solr PMC member, and Director at Sease Ltd.
He believes in Open Source as a way to build a bridge between Academia and Industry and facilitate the progress of applied research.
Alessandro is a passionate R&D software engineer, continuously applying the latest trends in Information Retrieval and AI to solve search problems.
He’s been working on Learning To Rank for years and, more recently, has been exploring Generative AI technologies such as Large Language Models and Retrieval Augmented Generation.
When he isn't on clients' projects, he contributes to the open-source community and presents at meet-ups and conferences such as ECIR, Search Solutions, Community Over Code, Haystack and Berlin Buzzwords.