Berlin Buzzwords 2025

How to train a fast LLM for coding tasks
2025-06-17, Frannz Salon

Coding LLMs are now part of our daily work, making coding easier. In this talk, we share how we built an in-house LLM for AI code completion in JetBrains products, covering design choices, data preparation, training, and model evaluation.


In this talk, we present our approach to training a code completion model, using Mellum, our new open-source model, as an example. Mellum powers in-file code completion in AI-enabled JetBrains IDEs. We'll walk through the entire process, from designing the model and preparing the dataset — with an emphasis on using permissively licensed data — to the training process and evaluation strategies. Attendees will gain insight into state-of-the-art techniques and the challenges we faced, and will discover practical approaches to optimizing AI models for real-world coding environments. This talk is relevant for developers and ML engineers interested in ML feature development and custom model training.


Tags:

Data Science, Scale

Level:

Intermediate

Senior MLE@JetBrains

  • Training models that write code and other code-related content