🍊 Latent Atlas 🍉

Training Compute-Optimal Large Language Models

May 31, 2026 · 1 min read

source
paper
chinchilla
compute-optimal
scaling-law

Basic information

Title: Training Compute-Optimal Large Language Models
Source type: paper
Related topic notes: Compute Optimal, Scaling Law, Pretraining Compute Optimal

TODO

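Read the original paper and summarize Chinchilla's compute-optimal scaling setup, experimental methodology, and core conclusions.
Fill in the empirical rule that parameter count and training-token count should grow in more balanced proportion, along with the limits of the common 20 tokens/parameter heuristic.
Contrast with the Kaplan scaling law, and summarize how to diagnose an undertrained large model and what that implies in practice.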
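
As a placeholder until the notes above are filled in, the 20 tokens/parameter heuristic can be turned into quick arithmetic. The sketch below is only a rough illustration, not the paper's fitting procedure: it assumes the common approximation C ≈ 6·N·D for training FLOPs (N parameters, D tokens) and a fixed ratio D = 20·N. The function name `compute_optimal_split` and its parameters are hypothetical.

```python
import math


def compute_optimal_split(flops: float, tokens_per_param: float = 20.0) -> tuple[float, float]:
    """Split a FLOP budget into (params, tokens).

    Assumptions (not taken from this note):
      * training compute C is approximated as 6 * N * D
      * tokens scale as D = tokens_per_param * N (heuristic default: 20)
    """
    if flops <= 0 or tokens_per_param <= 0:
        raise ValueError("flops and tokens_per_param must be positive")
    # Substituting D = r * N into C = 6 * N * D gives N = sqrt(C / (6 * r)).
    params = math.sqrt(flops / (6.0 * tokens_per_param))
    tokens = tokens_per_param * params
    return params, tokens


# Example budget of 5.88e23 FLOPs (an illustrative value).
params, tokens = compute_optimal_split(5.88e23)
print(f"params ~ {params:.3g}, tokens ~ {tokens:.3g}")
```

Under these assumptions a budget of 5.88e23 FLOPs works out to about 70B parameters and 1.4T tokens. Changing `tokens_per_param` shows how sensitive the split is to the ratio, which is part of what the boundary question above is meant to probe.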

Backlinks

Papers
Compute Optimal
Scaling
Scaling Law
