Scaling Laws
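Scaling laws describe how a model's loss changes with model size, training data size, and compute. They are used to pick hyperparameters for large models, such as the parameter count and number of training tokens for a given compute budget.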
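As a rough illustration of how a scaling law turns a compute budget into hyperparameters, here is a minimal sketch. It rests on two assumptions that do not come from this page: the common approximation that training compute is C ≈ 6·N·D FLOPs (N parameters, D tokens), and a fixed tokens-per-parameter ratio of about 20, a frequently quoted rule of thumb. The function name `compute_optimal_split` and its parameters are hypothetical.

```python
import math


def compute_optimal_split(compute_flops, tokens_per_param=20.0):
    """Return (n_params, n_tokens) for a training budget given in FLOPs.

    Assumptions (not from the source page):
      * training compute C ~= 6 * N * D
      * the data/model ratio D / N is fixed at tokens_per_param
    Substituting D = tokens_per_param * N into C = 6 * N * D gives
    N = sqrt(C / (6 * tokens_per_param)).
    """
    if compute_flops <= 0 or tokens_per_param <= 0:
        raise ValueError("compute_flops and tokens_per_param must be positive")
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


if __name__ == "__main__":
    # Print the suggested split for a few illustrative budgets.
    for budget in (1e21, 1e22, 1e23):
        n, d = compute_optimal_split(budget)
        print(f"C={budget:.0e} FLOPs -> N~{n:.2e} params, D~{d:.2e} tokens")
```

Under these assumptions, raising `tokens_per_param` for the same budget gives a smaller model trained on more tokens, which is one simple way to trade training efficiency for lower serving cost.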
Papers
- Large models are usually trained with scaling laws in mind, often targeting compute efficiency at deployment (inference) rather than compute-optimal training alone. See for example:
- Google 2023 - PaLM 2 Technical Report (see section 2)
Related Pages
ml/scaling_laws.1711484025.txt.gz · Last modified: 2024/03/26 20:13 by jmflanig