ml:optimizers
Differences
This shows you the differences between two versions of the page.
| ml:optimizers [2022/09/23 11:03] – [Modern Deep Learning Optimizers] jmflanig | ml:optimizers [2025/03/26 20:02] (current) – [Second-Order Optimizers] jmflanig | ||
|---|---|---|---|
| Line 3: | Line 3: | ||
| ===== Survey Papers ===== | ===== Survey Papers ===== | ||
| * Introduction: | * Introduction: | ||
| - | * [[https:// | + | |
| - | * [[https:// | + | |
| - | * Blog post: [[https:// | + | * **[[https:// |
| - | * Blog post about Adam, AdamW, and AMSGrad: [[https:// | + | * [[https:// |
| + | | ||
| + | | ||
| + | | ||
| + | * [[https:// | ||
| + | * Blog post about Adam, AdamW, and AMSGrad: [[https:// | ||
| ===== First-Order Optimizers ===== | ===== First-Order Optimizers ===== | ||
| Line 23: | Line 28: | ||
| * Nadam [[https:// | * Nadam [[https:// | ||
| * AdamW: [[https:// | * AdamW: [[https:// | ||
| + | * [[https:// | ||
| * RAdam: [[https:// | * RAdam: [[https:// | ||
| * EAdam: [[https:// | * EAdam: [[https:// | ||
| Line 28: | Line 34: | ||
| * Apollo: [[https:// | * Apollo: [[https:// | ||
| * [[https:// | * [[https:// | ||
| - | * Generalized SignSGD: [[https:// | + | * Generalized SignSGD: [[https:// |
| + | * **Lion**: [[https:// | ||
| + | * **Muon**: [[https:// | ||
| + | * Background on norms: [[https:// | ||
| + | * Applied to larger-scale LLM training: [[https:// | ||
| ==== Provably Linearly-Convergent Optimizers ==== | ==== Provably Linearly-Convergent Optimizers ==== | ||
| Line 36: | Line 46: | ||
| * AMSGrad [[https:// | * AMSGrad [[https:// | ||
| * JacSketch [[https:// | * JacSketch [[https:// | ||
| + | |||
| ==== Variance Reduction Techniques ==== | ==== Variance Reduction Techniques ==== | ||
| Line 68: | Line 79: | ||
| * [[https:// | * [[https:// | ||
| * Apollo: [[https:// | * Apollo: [[https:// | ||
| + | * [[https:// | ||
| ===== Gradient-Free Optimizers ===== | ===== Gradient-Free Optimizers ===== | ||
ml/optimizers.1663931038.txt.gz · Last modified: 2023/06/15 07:36 (external edit)