Berlin Buzzwords 2024

Learning to Rank for Reddit Search - A Project Retro
2024-06-10, Kesselhaus

The successes (and failures) of applying Machine Learning to improve Reddit's search ranking


In today’s AI-based world, Reddit stands out as a deep catalog of human, subjective information. Whether they are looking for product reviews or something deeply personal, Reddit searchers want to connect with other humans, not generic AI-based answers.

We at Reddit would like the site-search experience to be better, so you don’t need to add “Reddit” to your Google search. That’s what we’re trying to do with Learning to Rank: turning relevance into a repeatable, data-driven solution.

The journey hasn’t been an easy one. We want to share our painful lessons learned working with training data, developing features, the Solr Learning to Rank plugin, scaling Learning to Rank to 1000s of QPS, and more. Hopefully, you can learn from the egg we constantly found on our faces!

See how our scrappy, understaffed team has been slowly turning LTR from a science project into a repeatable process of constant, data-informed improvement. From a lab to an assembly line, come and learn from our painful lessons big and small.

Doug Turnbull has been enthusiastic about search relevance since 2013. He co-authored Relevant Search and AI-Powered Search. He created Quepid and Splainer for search relevance testing. He co-created the Elasticsearch Learning to Rank plugin with the Wikimedia Foundation and Snagajob. Doug loves learning from other search practitioners, and hopes you'll bring inquisitive curiosity and experiences to this talk.

Doug currently works at Reddit, where he's helping bring Machine Learning to search. Recently Doug worked at Shopify, where he helped improve merchant search-attributed revenue by 19% year over year. Doug spent 8 years consulting with dozens of organizations to improve search relevance during his time as CTO at OpenSource Connections.

Doug blogs about search and other topics at http://softwaredoug.com

Charles is a skilled technologist and innovator with a solid academic background from Brown University. With experience in both software engineering and finance, Charles has a proven track record of driving impactful changes in various fields.

At Zillow, Charles played a key role in the complete rebuild and redesign of the Zillow Search platform over several years, significantly improving its functionality and user experience. This achievement highlighted Charles' expertise in developing scalable, high-performance systems.

After transitioning to fintech, Charles used their technical skills to create one of the industry's first regulatory dispute products, marking a notable advancement in the regulatory technology landscape.

Currently, Charles is at the forefront of innovation at Reddit, contributing to Reddit's core infrastructure, specializing in Kubernetes and Solr-based solutions. Their work is focused on enhancing the performance and reliability of Reddit Search.

With a dedication to excellence and a forward-thinking approach, Charles continues to explore new possibilities in technology. Outside of work, Charles is an avid reader and an open water long-distance swimmer.