2025-06-16, Maschinenhaus
Struggling to find the relevant filters among too many facets, or frustrated by navigating search results? We present an AI Filter Assistant for statistical data (SDMX) and show how LLMs can suggest the best filters for a natural language query, helping you refine results in Apache Solr. We share wins, fails, and lessons learned.
In this talk, we explore an AI-powered Filter Assistant designed for the Statistical Data and Metadata eXchange (SDMX) standard to help users navigate search results efficiently and effectively.
We discuss how LLMs enhance filter suggestions by analyzing both user queries and indexed data.
On the architecture side, we break down:
1) Data retrieval – how we collected and processed the input SDMX data to build taxonomies used by the model to reconcile the concepts in the natural language query.
2) API structure – a deep dive into our endpoints, what they do, and the responses they return.
3) Model choice – the process of identifying the best LLM for the task, including our motivations and studies.
4) Structured output & JSON Schema – key benefits, limitations, and lessons learned from extensive testing. We showcase different test results and insights on what works best.
5) Solr query optimization – how to integrate the assistant’s output into a search query, using different boolean strategies to handle the refinement of both too-many and zero-result scenarios.
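As a rough illustration of step 5, the sketch below turns a structured assistant response into Solr filter queries under two boolean strategies. It is not the implementation presented in the talk: the response shape, the field names `REF_AREA` and `FREQ`, and the helper names are all assumptions made for the example. The strict strategy sends one `fq` per field, which Solr intersects, and suits the too-many-results case. The relaxed strategy ORs all field clauses into a single `fq` and could serve as a fallback when the strict query returns nothing.

```python
import json
from typing import Dict, List


def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def field_clause(field: str, values: List[str]) -> str:
    """OR together the suggested values for a single field."""
    return f"{field}:({' OR '.join(quote(v) for v in values)})"


def build_params(query: str, suggestions: Dict[str, List[str]],
                 strict: bool = True) -> Dict[str, object]:
    """Build Solr request parameters from hypothetical assistant output.

    strict=True  -> one fq per field (Solr intersects multiple fq parameters).
    strict=False -> a single fq that ORs every field clause together.
    """
    clauses = [field_clause(f, vs) for f, vs in suggestions.items() if vs]
    if not clauses:
        fq = []
    elif strict:
        fq = clauses
    else:
        fq = [" OR ".join(clauses)]
    return {"q": query, "fq": fq}


# Assumed shape of the assistant's structured (JSON) output.
raw_response = '{"REF_AREA": ["IT", "DE"], "FREQ": ["Q"]}'
suggestions = json.loads(raw_response)

user_query = "quarterly debt in Italy and Germany"
strict_params = build_params(user_query, suggestions, strict=True)
relaxed_params = build_params(user_query, suggestions, strict=False)

print(strict_params["fq"])
print(relaxed_params["fq"])
```

A caller could issue the strict request first and retry with the relaxed parameters only when the strict one returns zero documents.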
Expect real-world insights, practical takeaways, and a discussion on the future of AI-driven filtering!
Search, Data Science, Stories
Level: Intermediate
Ilaria is a Data Scientist with a background in Machine Learning and Natural Language Processing for Information Retrieval systems. Since joining the Sease team in 2020, she has worked on various projects, focusing on integrating Learning To Rank and Search Quality Evaluation in e-commerce ecosystems. More recently, she has been exploring the potential of Vector Search and Large Language Models in Search, leveraging these technologies to enhance retrieval strategies and improve result relevance.
Beyond her work, she is an active member of the information retrieval research community, regularly sharing her insights through blog posts, contributing to open-source projects, and speaking at international conferences such as Berlin Buzzwords and ElasticON.
Hi!
I’m Anna Ruggero, an IT consultant in the information retrieval world.
I support clients in improving their search engines by implementing innovative, personalized solutions. I specialize in the integration of machine learning techniques with information retrieval systems, from Learning-to-Rank techniques to Neural Search and Recommender Systems.
I have worked extensively on e-commerce websites, improving their performance by developing personalized models and evaluation systems.
I strongly believe in innovation and research, keeping up to date with the latest academic studies and contributing to them. I participated in the European Conference on Information Retrieval (ECIR) 2022 with a poster on offline and online evaluation in industry, and published a paper on improving interleaving techniques for the evaluation of information retrieval systems at ECIR 2023.
I can't wait to talk about search with you!
Edward Lambe is the Head of the MED Data Engineering team and the Deputy Head of MED IT at the Bank for International Settlements (BIS). Since joining the BIS in 2016, Edward has overseen the implementation of several key projects within the IT unit of the Monetary and Economic Department. Notably, he led the delivery of the BIS Data Portal, a core initiative of the BIS 2025 Innovation programme aimed at modernising the dissemination of BIS statistics. Prior to his tenure at the BIS, Edward held various statistical and IT roles at the Central Statistics Office, Ireland, and the Bank of Ireland. He holds a master’s degree from the Cork Institute of Technology and a bachelor’s degree from the National University of Ireland, Cork.