ml:scaling_laws
Differences
| ml:scaling_laws [2024/04/09 14:53] – [Papers] jmflanig | ml:scaling_laws [2025/06/01 23:09] (current) – [Related Pages] jmflanig | ||
|---|---|---|---|
| Line 5: | Line 5: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | * **[[https:// | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| ==== Training LLMs ==== | ==== Training LLMs ==== | ||
| * Large models are usually trained with scaling laws in mind (often compute-optimal for deployment rather than for training). | * Large models are usually trained with scaling laws in mind (often compute-optimal for deployment rather than for training). | ||
| * [[https:// | * [[https:// | ||
| + | |||
| + | ==== Emergent Abilities ==== | ||
| + | See also [[nlp: | ||
| + | |||
| + | * GPT-3: [[https:// | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| + | * **[[https:// | ||
| + | * [[https:// | ||
| ===== Related Pages ===== | ===== Related Pages ===== | ||
| * [[Hyperparameter Tuning]] | * [[Hyperparameter Tuning]] | ||
| * [[nlp: | * [[nlp: | ||
| + | * [[nlp: | ||
| + | * [[nlp: | ||
ml/scaling_laws.1712674432.txt.gz · Last modified: 2024/04/09 14:53 by jmflanig