Berlin Buzzwords 2026

Constant-Time Aggregations with Star-Tree in OpenSearch
2026-06-08, Maschinenhaus

Discover how OpenSearch breaks the linear scaling of aggregation latency with document count. Inspired by Apache Pinot, the Star-Tree index moves the performance dependency from document count to field cardinality. Learn how we extended Lucene’s DocValues to build multi-dimensional materialized views that deliver sub-second analytics on billion-scale datasets for observability workloads.


Traditional distributed search engines face a significant bottleneck: aggregation latency scales linearly with document count. As datasets grow to billions of records, this "scan-on-query" model fails to meet real-time requirements. Inspired by Apache Pinot and star-cube research, OpenSearch introduced the Star-Tree index to decouple performance from raw data volume.

This session dives into the engineering behind this transition. We will explore how we extended Lucene’s DocValuesFormat to support multi-field materialized views directly within segment structures, shifting the performance dependency from total document count to the cardinality of the indexed dimensions.
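To make the shift from document count to cardinality concrete, here is a minimal Python sketch. It is not OpenSearch or Lucene code: the sample documents, the field names (status, region, latency), and the helper functions are all hypothetical, and the "view" is a plain dictionary rather than anything stored in a segment. It only illustrates that a pre-aggregated view has at most one entry per combination of dimension values, however many documents were indexed.

```python
from collections import defaultdict

# Hypothetical documents: (status, region, latency_ms)
docs = [
    ("200", "eu", 12.0),
    ("200", "us", 20.0),
    ("500", "eu", 95.0),
    ("200", "eu", 8.0),
    ("500", "us", 110.0),
    ("200", "us", 30.0),
]

def scan_sum(docs, status):
    """Scan-on-query: visits every document, so cost grows with document count."""
    return sum(lat for s, _, lat in docs if s == status)

def build_view(docs):
    """Pre-aggregate once at index time: one entry per (status, region) pair."""
    view = defaultdict(lambda: [0.0, 0])  # [sum, count]
    for s, r, lat in docs:
        cell = view[(s, r)]
        cell[0] += lat
        cell[1] += 1
    return dict(view)

def view_sum(view, status):
    """Query the view: visits at most |status| x |region| entries."""
    return sum(total for (s, _), (total, _) in view.items() if s == status)

view = build_view(docs)
```

Both `scan_sum(docs, "200")` and `view_sum(view, "200")` return the same total, but the second touches four view entries regardless of how many documents share those dimension values.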

We will detail the implementation of "star nodes" (wildcard structures representing aggregates across all values of a dimension) and how they enable constant-time query pruning. Attendees will learn about the challenges of building a multi-field index in a single-field-centric ecosystem like Lucene, how to tune the trade-off between storage and speed, and why this architectural shift achieved up to 1000x faster queries. Finally, we will discuss operational lessons learned, including why this structure is limited to append-only workloads and how it bridges the gap between traditional search and OLAP-style analytics.
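The star-node idea can be sketched as a toy in-memory tree in Python. This is an illustration under stated assumptions, not the OpenSearch implementation: the `Node` class, the `build` and `query` functions, and the sample rows are hypothetical, and the sketch creates a star node at every level, whereas a real index would bound how much it materializes to trade storage against speed.

```python
STAR = "*"  # wildcard key: aggregate across all values of a dimension

class Node:
    def __init__(self):
        self.children = {}  # dimension value (or STAR) -> Node
        self.total = 0.0    # pre-aggregated sum of the metric
        self.count = 0      # number of rows under this node

def build(rows, depth=0, num_dims=2):
    """rows: list of (dims_tuple, metric). Builds one tree level per dimension."""
    node = Node()
    node.total = sum(m for _, m in rows)
    node.count = len(rows)
    if depth == num_dims:
        return node
    groups = {}
    for dims, m in rows:
        groups.setdefault(dims[depth], []).append((dims, m))
    for value, group in groups.items():
        node.children[value] = build(group, depth + 1, num_dims)
    # Star node: the same rows with this dimension ignored.
    node.children[STAR] = build(rows, depth + 1, num_dims)
    return node

def query(root, filters):
    """filters: one entry per dimension; None means 'any value'."""
    node = root
    for f in filters:
        key = STAR if f is None else f
        node = node.children.get(key)
        if node is None:
            return 0.0, 0  # no rows match this filter value
    return node.total, node.count

# Hypothetical rows: ((status, region), latency_ms)
rows = [
    (("200", "eu"), 12.0),
    (("200", "us"), 20.0),
    (("500", "eu"), 95.0),
    (("200", "eu"), 8.0),
]
root = build(rows)
```

A query such as `query(root, (None, "eu"))` follows the star child for the unfiltered status dimension and then the "eu" child, so it takes one hop per dimension no matter how many rows were indexed. The price is extra storage for the star subtrees, which is the storage-versus-speed trade-off mentioned above.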


Level: Intermediate
See also: Star Tree Index

Sandesh is a Software Developer working on the OpenSearch Project, with a focus on enhancing search performance and cluster resilience. He is also a maintainer of OpenSearch (core) and hosts the Search Backlog & Triage Community Meeting every Wednesday, where he engages with the community to review open issues, prioritize feature requests, and drive improvements to OpenSearch's Search components.

Shailesh Kumar Singh is a Software Development Engineer at Amazon Web Services, working on OpenSearch. His work focuses on building high-performance analytics systems at scale, with contributions to aggregation optimization through Star Tree indexing and efficient data processing and compaction using Parquet. He is particularly interested in designing scalable systems that balance performance, storage efficiency, and real-world usability.

He holds a Bachelor’s degree in Computer Science from BITS Pilani, with a minor in Finance, and is interested in scalable systems and fintech.