ml:nn_training
Differences
This shows you the differences between two versions of the page.
| ml:nn_training [2022/05/27 03:39] – [Topics] jmflanig | ml:nn_training [2024/07/09 22:29] (current) – [Topics] jmflanig | ||
|---|---|---|---|
| Line 5: | Line 5: | ||
| * [[https:// | * [[https:// | ||
| * [[http:// | * [[http:// | ||
| + | * [[https:// | ||
| ===== Topics ===== | ===== Topics ===== | ||
| Line 18: | Line 19: | ||
| * [[Regularization]] | * [[Regularization]] | ||
| * [[Fine-Tuning]] and [[nlp: | * [[Fine-Tuning]] and [[nlp: | ||
| - | * [[NN Tricks|Misc Tricks]] | + | |
| * Tricks such as [[Curriculum Learning]], etc | * Tricks such as [[Curriculum Learning]], etc | ||
| * [[nlp: | * [[nlp: | ||
| * Residual connections, | * Residual connections, | ||
| + | * [[https:// | ||
| * [[Large-Scale]] and [[Distributed Training]] | * [[Large-Scale]] and [[Distributed Training]] | ||
| Line 33: | Line 35: | ||
| * Transformer: | * Transformer: | ||
| * **[[https:// | * **[[https:// | ||
| + | * TODO: GPT-1 (p. 5) | ||
| * Low-resource NMT system: [[https:// | * Low-resource NMT system: [[https:// | ||
| - | * BART | + | * TODO: BART |
| + | * [[https:// | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| ^ Paper ^ Architecture ^ Optimizer ^ Optimizer Hyperparameters ^ Initialization ^ Normalization ^ Regularizer ^ Learning Schedule ^ Stopping Criterion ^ Activation Function ^ Tokenization ^ Extras ^ | ^ Paper ^ Architecture ^ Optimizer ^ Optimizer Hyperparameters ^ Initialization ^ Normalization ^ Regularizer ^ Learning Schedule ^ Stopping Criterion ^ Activation Function ^ Tokenization ^ Extras ^ | ||
ml/nn_training.1653622757.txt.gz · Last modified: 2023/06/15 07:36 (external edit)