2026-06-09, Kesselhaus
Agentic systems can break not because information is missing, but because persuasively wrong context gets promoted into action. We examine a recurring pattern: retrieval metrics improve while agent behavior degrades as distractors enter multi-step loops. We show why relevance, reliability, and security are tightly connected in agentic retrieval.
In agentic workflows, retrieval is no longer just ranking for a human reader; it is context injection into reasoning and tool use. That shift changes the failure mode. Plausible but incorrect evidence can degrade outcomes disproportionately, and in noisy settings, longer reasoning can make answers worse rather than better. This is inverse scaling under noise: more capable reasoning produces more confident mistakes. In iterative agent loops, those mistakes are recycled and amplified, turning small retrieval defects into workflow-level failures.
In this talk, we'll break down the main failure modes: plausible distractors, error compounding across steps, and the gap between traditional retrieval metrics and real task utility. We'll present design patterns for robust agentic retrieval: stricter evidence selection, sufficiency checks before acting, and explicit pause/retry/escalate behavior when confidence is not warranted. We'll also connect these patterns to challenges in open agent tooling ecosystems, where untrusted context makes retrieval a threat surface as well as a ranking problem.
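As a rough illustration of how those three patterns can fit together, here is a minimal Python sketch. It is not the speaker's implementation. The `Evidence` fields, the `trusted` flag, the score threshold, and the `min_support` and `max_attempts` parameters are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ACT = "act"
    RETRY = "retry"
    ESCALATE = "escalate"


@dataclass
class Evidence:
    text: str
    score: float   # hypothetical relevance score in [0, 1]
    trusted: bool  # hypothetical flag: source passed a trust check


def select_evidence(candidates, min_score=0.75):
    """Stricter selection: keep only trusted passages above the threshold."""
    return [e for e in candidates if e.trusted and e.score >= min_score]


def decide(candidates, attempt, min_support=2, max_attempts=3):
    """Sufficiency check before acting, with explicit retry and escalate paths.

    `attempt` is the zero-based index of the current retrieval attempt.
    """
    selected = select_evidence(candidates)
    if len(selected) >= min_support:
        return Action.ACT, selected       # enough independent support
    if attempt + 1 < max_attempts:
        return Action.RETRY, selected     # pause the action and retrieve again
    return Action.ESCALATE, selected      # budget exhausted: hand off


candidates = [
    Evidence("passage a", 0.90, True),
    Evidence("passage b", 0.95, False),  # high-scoring but untrusted distractor
    Evidence("passage c", 0.80, True),
    Evidence("passage d", 0.40, True),
]
action, support = decide(candidates, attempt=0)
```

The design choice worth noting is that a high relevance score alone never triggers an action: the untrusted passage is dropped despite ranking first, and the agent acts only when enough filtered evidence remains.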
Lester Solbakken is a Founding Engineer at HORNET.dev, where he builds production-grade retrieval infrastructure for AI agents. He previously pursued a PhD in Artificial Intelligence and Machine Learning, with research centered on neural networks, exploratory data analysis, and self-organizing systems. He speaks about building reliable, high-performance AI systems that bridge research and real-world deployment.