Berlin Buzzwords 2024

Shattering the Limits of Search with Domain Specific Computing
2024-06-10 , Kesselhaus

The demand for advanced search and data retrieval capabilities is growing exponentially: the rise of AI applications, along with the unprecedented scaling of data, is leading to workloads that push traditional software-based search to its limits.
With cloud computing costs skyrocketing and queries becoming more complex, organizations are often forced to compromise on result relevancy to meet strict real-time latency requirements of 100ms.
To address this issue, a new dimension of search must be introduced: domain-specific computing. By designing a dedicated chip for search, it consistently achieves ten times faster search at billion scale, all at a fraction of the infrastructure cost.


Maintaining latency below 100ms at scale depends on several factors, such as the level of throughput (QPS), the complexity of the queries, the hardware and network infrastructure, and the indexing strategy being used. Even with all of these in place, maintaining real-time latency at large scale is extremely challenging for a number of reasons:
• Millions to billions of documents are becoming the standard
• Both BM25 and k-NN algorithms are compute-intensive and require extensive compute power at scale
• High query throughput (QPS) during surges in demand
• Real-time database updates running in parallel with search queries
A major contributor to the underperformance of existing software-based solutions is their dependence on general-purpose CPUs, which are designed to handle a wide range of tasks but are not optimized for any specific one. The outcome is suboptimal search processing power.
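To make the compute cost concrete, here is a minimal sketch of the standard BM25 scoring formula (parameter names `k1` and `b` follow the common convention; this is an illustration, not the internals of any particular engine). The key point is that this arithmetic runs per query term for every candidate document, so at billion scale the scoring loop alone dominates CPU time:

```python
import math

def bm25_score(query_terms, doc_tf, doc_len, avg_doc_len, df, n_docs,
               k1=1.2, b=0.75):
    """Score one document against a query with BM25.

    doc_tf: term -> frequency in this document
    df:     term -> number of documents containing the term
    """
    score = 0.0
    for term in query_terms:
        tf = doc_tf.get(term, 0)
        if tf == 0:
            continue  # term absent, contributes nothing
        # Inverse document frequency: rare terms weigh more
        idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
        # Term-frequency saturation with document-length normalization
        norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
        score += idf * norm
    return score
```

Every candidate document repeats this per-term work, which is exactly the kind of regular, highly parallel arithmetic that a purpose-built chip can execute far more efficiently than a general-purpose CPU.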

This is where Domain-specific computing comes into play.
Domain-specific architectures are tailored for specific workloads, significantly boosting processing speeds by tens to hundreds of times. These purpose-built chips excel in performing thousands of computations simultaneously, making them ideal for high-throughput tasks like data processing and search indexing. Additionally, they can offload processing from the CPU, handling specific operations such as term matching, sorting, and filtering, which reduces CPU workload and enhances query performance.
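The offloading pattern described above can be sketched as follows. This is a hypothetical illustration, not a real device API: `StubAccelerator`, `Hit`, and the shard layout are invented here to show the division of labor, where the host CPU only fans out requests and merges partial results while the per-document matching and sorting are delegated to the device:

```python
from dataclasses import dataclass

@dataclass
class Hit:
    doc_id: int
    score: float

class StubAccelerator:
    """Stand-in for a purpose-built search chip. Here term matching
    and sorting run on the host, but in a real system this scan would
    execute on the device, off the CPU's critical path."""
    def scan(self, shard, query_terms, top_k):
        hits = [Hit(doc_id, float(sum(text.split().count(t)
                                      for t in query_terms)))
                for doc_id, text in shard]
        hits.sort(key=lambda h: h.score, reverse=True)
        return hits[:top_k]

def search(shards, query_terms, top_k, accel):
    # CPU side: orchestration only -- dispatch each shard scan to the
    # accelerator, then merge the partial top-k lists.
    partials = [accel.scan(shard, query_terms, top_k) for shard in shards]
    merged = sorted((h for p in partials for h in p),
                    key=lambda h: h.score, reverse=True)
    return merged[:top_k]
```

The design point is that the host never touches individual documents: its work grows with the number of shards and `top_k`, not with corpus size, which is what keeps CPU load flat as the index scales.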
Purpose-built chips like FPGAs have become available in the cloud (e.g. on AWS) and are compatible with Elasticsearch. They support dynamic scaling to efficiently manage fluctuating traffic volumes and deliver rapid search results, enhancing user experience and boosting customer satisfaction.
Overall, domain-specific computing pushes search and data retrieval capabilities to new heights, moving beyond traditional data-distribution methods for efficiency. It minimizes software-stack overhead and enables organizations to achieve the performance they need at scale, and with consistency.




This session is sponsored by Hyperspace

A product and business expert, Ohad is a visionary product leader with over 15 years of experience in driving product strategies for disruptive technologies, building strong teams and engaging people around ideas. After leading product teams at Intel, HP and Click Software and launching enterprise-grade products in new markets, Ohad set out on a new journey. As the CEO and Founder of Hyperspace, Ohad is set on a mission to introduce an enterprise-grade, AI-search acceleration engine for companies making real-time predictions and facing performance and scalability challenges.