Scaling Laws
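Scaling laws describe how a model's loss changes as model size, dataset size, and training compute grow. They are used to pick hyperparameters for large models, such as the parameter count and the number of training tokens for a given compute budget, by extrapolating from smaller training runs.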
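As a rough illustration of the idea, a scaling law is often modeled as a power law, loss ≈ a * size^(-b), and fitted with a straight line in log-log space. The sketch below assumes that simple form and uses made-up synthetic data; the function names `fit_power_law` and `predict_loss` are hypothetical, and this is not the fitting procedure of any paper listed on this page.

```python
import numpy as np

def fit_power_law(sizes, losses):
    """Fit loss ~ a * size**(-b) by least squares in log-log space.

    Assumes a pure power law with no irreducible-loss offset.
    """
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), deg=1)
    a = float(np.exp(intercept))
    b = float(-slope)
    return a, b

def predict_loss(size, a, b):
    """Extrapolate the fitted power law to a new model size."""
    return a * size ** (-b)

# Synthetic, illustrative data only: parameter counts and losses
# generated from a known power law so the fit can be checked.
sizes = np.array([1e6, 1e7, 1e8, 1e9])
losses = 10.0 * sizes ** (-0.1)

a, b = fit_power_law(sizes, losses)
predicted = predict_loss(1e10, a, b)
print(f"a={a:.3f}, b={b:.3f}, predicted loss at 1e10 params={predicted:.3f}")
```

In practice the small-run measurements are noisy and the fitted form usually includes extra terms, so an extrapolation like this should be treated as an estimate rather than a guarantee.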
Papers
Training LLMs
- Large models are usually trained with scaling laws in mind, although the target is often a model that is compute-optimal for deployment (inference) rather than for training. See for example:
- Google 2023 - PaLM 2 Technical Report (see section 2)
Emergent Abilities
Related Pages
ml/scaling_laws.1719008000.txt.gz · Last modified: 2024/06/21 22:13 by jmflanig