Berlin Buzzwords 2024

Harnessing Spare Cores to Breeze Through Cloud Compute
06-11, 09:30–09:50 (Europe/Berlin), Frannz Salon

Spare Cores, a Python-based open-source ecosystem, offers standardized inventory and performance evaluations for compute resources of public cloud and server providers, helping with selecting and starting the optimal instance type for containerized tasks, such as ML model training or ETL processes.


Spare Cores is a Python-based open-source ecosystem that provides a standardized inventory and performance evaluations of the compute resources available across public cloud and server providers.

In this talk, we demonstrate how Spare Cores can help identify the optimal instance types across various vendors and datacenters for containerized tasks, such as training machine learning models on TPUs, rendering videos on GPUs, executing ETL processes that require a lot of RAM, or running computationally heavy microservices. The inventory can be queried through our public or self-hosted web application, Swagger/OpenAPI-documented APIs, or the open-source SDKs for several programming languages, which are accompanied by CLI helpers for launching instances within your existing environment.
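As a rough illustration of the selection step described above, the following minimal sketch filters an inventory by resource requirements and picks the cheapest match. It does not use the Spare Cores SDK or API: the `Server` class, the `cheapest_match` helper, and every vendor name, instance name, and price below are hypothetical placeholders invented for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Server:
    """One row of a hypothetical inventory (not the Spare Cores schema)."""
    vendor: str
    name: str
    vcpus: int
    memory_gib: float
    gpus: int
    price_per_hour: float  # USD; made-up numbers for illustration only


# Placeholder inventory: none of these vendors, names, or prices are real offers.
INVENTORY = [
    Server("vendor-a", "small", 2, 4.0, 0, 0.05),
    Server("vendor-a", "highmem", 8, 64.0, 0, 0.40),
    Server("vendor-b", "balanced", 8, 32.0, 0, 0.30),
    Server("vendor-b", "highmem", 8, 64.0, 0, 0.35),
    Server("vendor-c", "gpu", 8, 64.0, 1, 1.20),
]


def cheapest_match(
    inventory: Iterable[Server],
    min_vcpus: int = 1,
    min_memory_gib: float = 0.0,
    min_gpus: int = 0,
) -> Optional[Server]:
    """Return the cheapest server meeting all minimums, or None if none does."""
    candidates = [
        s for s in inventory
        if s.vcpus >= min_vcpus
        and s.memory_gib >= min_memory_gib
        and s.gpus >= min_gpus
    ]
    return min(candidates, key=lambda s: s.price_per_hour, default=None)


if __name__ == "__main__":
    # A RAM-heavy ETL job: at least 8 vCPUs and 64 GiB of memory, no GPU needed.
    print(cheapest_match(INVENTORY, min_vcpus=8, min_memory_gib=64.0))
```

A real selection would likely also weigh benchmark scores, region, and availability, which this sketch leaves out to keep the idea of "filter, then rank by price" visible.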

We also briefly showcase a streamlined SaaS solution for those who prefer simplicity or do not want to manage their own cloud infrastructure. The managed Spare Cores environment covers the entire life cycle of batch jobs and microservices, eliminating the need for direct vendor engagement.

Spare Cores is a new project, started in Q4 2023 as part of the NGI Search Open Call #3: https://www.ngisearch.eu/view/Events/OC3Searchers

See also: Slides

Gergely Daroczi has been an enthusiastic R user and package developer for 20 years. He holds a Ph.D. in social sciences, is a former Assistant Professor in Sociology, and is currently a Lecturer at the Business Analytics program of CEU. He has 15+ years of industry experience in data science, engineering, cloud infrastructure, and data operations at SaaS, fintech, adtech, and healthtech startups, with a strong interest in building scalable data platforms. He maintains a dozen open-source packages related to using R in production (automated reports, logging, database connections, API integrations), has contributed to Python packages, co-authored several journal articles in social and medical sciences, and wrote the book "Mastering Data Analysis with R".