ml:nn_training
Differences
This shows you the differences between two versions of the page.
| ml:nn_training [2022/03/02 19:17] – [Topics] jmflanig | ml:nn_training [2024/07/09 22:29] (current) – [Topics] jmflanig | ||
|---|---|---|---|
| Line 5: | Line 5: | ||
| * [[https:// | * [[https:// | ||
| * [[http:// | * [[http:// | ||
| + | * [[https:// | ||
| ===== Topics ===== | ===== Topics ===== | ||
| Line 12: | Line 13: | ||
| * [[NN Initialization|Initialization]] | * [[NN Initialization|Initialization]] | ||
| * [[Normalization]] | * [[Normalization]] | ||
| - | * Parameter Learning: see [[Optimizers]] | + | * [[Optimizers]] |
| - | * Choosing the Learning Rate | + | * [[Learning Rate]] |
| * [[https:// | * [[https:// | ||
| - | * See also [[Learning Rate]] | ||
| * [[Loss Functions]] | * [[Loss Functions]] | ||
| * [[Regularization]] | * [[Regularization]] | ||
| * [[Fine-Tuning]] and [[nlp: | * [[Fine-Tuning]] and [[nlp: | ||
| - | * [[NN Tricks|Misc Tricks]] | + | |
| * Tricks such as [[Curriculum Learning]], etc | * Tricks such as [[Curriculum Learning]], etc | ||
| - | * [[nlp: | + | * [[nlp: |
| * Residual connections, | * Residual connections, | ||
| + | * [[https:// | ||
| * [[Large-Scale]] and [[Distributed Training]] | * [[Large-Scale]] and [[Distributed Training]] | ||
| Line 34: | Line 35: | ||
| * Transformer: | * Transformer: | ||
| * **[[https:// | * **[[https:// | ||
| + | * TODO: GPT-1 (p. 5) | ||
| * Low-resource NMT system: [[https:// | * Low-resource NMT system: [[https:// | ||
| - | * BART | + | * TODO: BART |
| + | * [[https:// | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| ^ Paper ^ Architecture ^ Optimizer ^ Optimizer Hyperparameters ^ Initialization ^ Normalization ^ Regularizer ^ Learning Schedule ^ Stopping Criterion ^ Activation Function ^ Tokenization ^ Extras ^ | ^ Paper ^ Architecture ^ Optimizer ^ Optimizer Hyperparameters ^ Initialization ^ Normalization ^ Regularizer ^ Learning Schedule ^ Stopping Criterion ^ Activation Function ^ Tokenization ^ Extras ^ | ||
| Line 44: | Line 49: | ||
| | [[https:// | | [[https:// | ||
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | ||
| + | |||
| + | ===== Alternative Training Methods ===== | ||
| + | See [[Alternative Training Methods|Neural Networks: Alternative Training Methods]] | ||
| ===== Related Pages ===== | ===== Related Pages ===== | ||
| + | * [[Alternative Training Methods]] | ||
| * [[nlp:Data Preparation]] | * [[nlp:Data Preparation]] | ||
| * [[Hyperparameter Tuning]] | * [[Hyperparameter Tuning]] | ||
ml/nn_training.1646248638.txt.gz · Last modified: 2023/06/15 07:36 (external edit)