
Scaling Laws

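Scaling laws describe how a model's loss changes with model size, dataset size, and training compute. They are used to pick optimal hyperparameters for large models, typically by fitting the law on small training runs and extrapolating to the target scale.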
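As a rough illustration of how such a fit might work, the sketch below fits a pure power law loss = a * N^(-b) by least squares in log-log space and extrapolates to a larger model size. The function names (fit_power_law, predict_loss), the functional form, and the data points are all assumptions made for illustration: the data is synthetic, generated from a known law rather than measured, and published scaling laws often add an irreducible-loss term that this simplified form omits.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of loss = a * size**(-b) in log-log space.

    Returns (a, b). Assumes all sizes and losses are positive.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(v) for v in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Ordinary least-squares line: y = intercept + slope * x
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(loss) = log(a) - b * log(size), so a = exp(intercept), b = -slope
    return math.exp(intercept), -slope


def predict_loss(a, b, size):
    """Evaluate the fitted power law at a given model size."""
    return a * size ** (-b)


# Synthetic, illustrative data from a known law (a=10, b=0.1); not real measurements.
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [10.0 * n ** (-0.1) for n in sizes]

a, b = fit_power_law(sizes, losses)
extrapolated = predict_loss(a, b, 1e10)  # predicted loss at a 10x larger model
print(f"a={a:.3f}, b={b:.3f}, predicted loss at 1e10 params: {extrapolated:.3f}")
```

With real measurements the points will not lie exactly on a line in log-log space, so the quality of the fit and the range of the extrapolation both deserve scrutiny before the result is used to choose a model size.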

Papers

Training LLMs

  • Large models are usually trained with scaling laws in mind, though often to be compute-optimal for deployment (inference) rather than for training. See for example:

Emergent Abilities

See also Language Model - Origin of Capabilities.

ml/scaling_laws.txt · Last modified: 2025/06/01 23:09 by jmflanig
