Scaling Laws

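Scaling laws are empirical relationships, typically power laws, that describe how a model's loss changes with parameter count, dataset size, and training compute. They are fit on small training runs and extrapolated to pick hyperparameters, such as model size and number of training tokens, for large models.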
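
As a rough illustration of how such a law might be used, the sketch below fits a power law of the form loss = a * compute^(-b) by linear regression in log-log space and extrapolates it to a larger compute budget. The functional form, the helper names `fit_power_law` and `predict`, and all the numbers are assumptions made for this example; the data is synthetic and not taken from any paper.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y = a * x**(-b) via least squares in log-log space. Returns (a, b)."""
    # log(y) = log(a) - b * log(x), so a straight-line fit gives both parameters.
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(intercept)), float(-slope)


def predict(x, a, b):
    """Evaluate the fitted power law at x."""
    return a * x ** (-b)


# Synthetic (made-up) measurements from hypothetical small training runs.
compute = np.array([1e17, 1e18, 1e19, 1e20])  # training FLOPs
loss = 50.0 * compute ** (-0.05)              # generated from a known power law

a, b = fit_power_law(compute, loss)

# Extrapolate to a budget larger than any of the runs used for the fit.
extrapolated = predict(1e22, a, b)
print(f"a={a:.3f}, b={b:.4f}, predicted loss at 1e22 FLOPs: {extrapolated:.3f}")
```

Real measurements are noisy and often include an irreducible loss term, so a plain log-log line fit like this one is only a starting point.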

Papers

Training LLMs

Emergent Abilities

See also Language Model - Origin of Capabilities.