2025-06-17, Frannz Salon
Coding LLMs are now part of our daily work, making coding easier. In this talk, we share how we built an in-house LLM for AI code completion in JetBrains products, covering design choices, data preparation, training, and model evaluation.
In this talk, we present our approach to training a code completion model, using Mellum, our new open-source model, as an example. Mellum powers in-file code completion in AI-enabled JetBrains IDEs. We'll walk through the entire process, from designing the model and preparing the dataset (with emphasis on how permissively the data may be used) to the training process and evaluation strategies. Attendees will gain insight into state-of-the-art techniques and the challenges we faced, and will discover practical approaches to optimizing AI models for real-world coding environments. This talk is relevant for developers and ML engineers interested in ML feature development and custom model training.
Data Science, Scale
Level: Intermediate
Senior MLE at JetBrains
- Training models which write code-related things