Berlin Buzzwords 2025

How to train a fast LLM for coding tasks
2025-06-17, Frannz Salon

Coding LLMs are now part of our daily work, making coding easier. In this talk, we share how we built an in-house LLM for AI code completion in JetBrains products, covering design choices, data preparation, training, and model evaluation.


In this talk, we present our approach to training a code completion model, using Mellum, our new open-source model, as an example. Mellum powers in-file code completion in AI-enabled JetBrains IDEs. We'll walk through the entire process, from designing the model and preparing the dataset — with an emphasis on using permissively licensed data — to the training process and evaluation strategies. Attendees will gain insight into state-of-the-art techniques and the challenges we faced, and will discover practical approaches to optimizing AI models for real-world coding environments. This talk is relevant for developers and ML engineers interested in ML feature development and custom model training.


Tags:

Data Science, Scale

Level:

Intermediate

Senior MLE@JetBrains

  • Training models that write code and other code-related content