AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs

May 1, 2025
Feiyang Kang, Yifan Sun, Bingbing Wen, Si Chen, Dawn Song, Rafid Mahmood, Ruoxi Jia
Abstract
We present AutoScale, a method for automatically predicting compute-optimal data composition for training large language models, improving training efficiency and model performance.
Type
Publication
COLM 2025

We present AutoScale, a method for automatically predicting compute-optimal data composition for training large language models. Our approach improves training efficiency and model performance by optimizing the data mixing strategy during pretraining.