

Scaling Laws

Scaling laws describe how a model's loss changes with model size, dataset size, and training compute. They are used to pick optimal hyperparameters for large models.
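
As a rough illustration of how such a law can be used, the sketch below fits a pure power law, loss = a * size^(-b), to a handful of small runs by linear regression in log-log space, then extrapolates to a larger model. The function names `fit_power_law` and `predict_loss` are hypothetical, and the data points are synthetic values generated from a known power law, not measurements. Published scaling laws often add an irreducible loss term and a dataset-size term, both omitted here for brevity.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of loss = a * size**(-b) in log-log space.

    Returns the pair (a, b). Hypothetical helper for illustration only.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Ordinary least squares for the line log(loss) = intercept + slope * log(size)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_loss(a, b, size):
    """Evaluate the fitted power law at a new model size."""
    return a * size ** (-b)


# Synthetic data (assumption): generated from a = 10, b = 0.1, not real runs.
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [10.0 * n ** -0.1 for n in sizes]

a, b = fit_power_law(sizes, losses)
predicted = predict_loss(a, b, 1e10)  # extrapolate one order of magnitude
print(f"a={a:.3f}, b={b:.3f}, predicted loss at 1e10 params: {predicted:.3f}")
```

Fitting in log-log space keeps the example dependency-free, but it weights all points equally in log terms; with noisy real measurements a nonlinear fit may be more appropriate.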

Papers

Training LLMs

  • Large models are usually trained with scaling laws in mind (often chosen to be compute-optimal for deployment rather than for training). See for example:

Emergent Abilities

See also Language Model - Origin of Capabilities.

ml/scaling_laws.1748819352.txt.gz · Last modified: 2025/06/01 23:09 by jmflanig
