AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs
May 1, 2025
Feiyang Kang
Yifan Sun
Bingbing Wen
Si Chen
Dawn Song
Rafid Mahmood
Ruoxi Jia

Abstract
We present AutoScale, a method for automatically predicting compute-optimal data composition for training large language models, improving training efficiency and model performance.
Type
Publication
COLM 2025
AutoScale improves training efficiency and model performance by optimizing the data mixing strategy during pretraining.