ml:nn_tricks
Differences
This shows you the differences between two versions of the page.
| ml:nn_tricks [2021/10/09 20:34] – [Neural Network Tricks] jmflanig | ml:nn_tricks [2023/10/11 22:19] (current) – jmflanig | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Neural Network Tricks ====== | ====== Neural Network Tricks ====== | ||
| + | |||
| + | ===== Overviews ===== | ||
| + | * NLP 202 lecture: [[https:// | ||
| * Training Tricks (see [[NN Training]]) | * Training Tricks (see [[NN Training]]) | ||
| Line 9: | Line 12: | ||
| * [[Curriculum Learning]] | * [[Curriculum Learning]] | ||
| * Overcoming [[Catastrophic Forgetting]] | * Overcoming [[Catastrophic Forgetting]] | ||
| - | * Adjust the batch size, or use gradient accumulation to simulate larger batch sizes | + | * Adjust the batch size, or use gradient accumulation |
| - | * Try a different [[optimizers# | + | * Try a different [[optimizers# |
| * Adjust [[https:// | * Adjust [[https:// | ||
| + | * Fine-tuning Specific Tricks | ||
| + | * [[https:// | ||
| * Regularization Tricks (see [[Regularization]]) | * Regularization Tricks (see [[Regularization]]) | ||
| * [[Regularization# | * [[Regularization# | ||
| * [[Ensembling]] | * [[Ensembling]] | ||
| * [[Knowledge Distillation]] (can improve performance by acting as a form of regularization) | * [[Knowledge Distillation]] (can improve performance by acting as a form of regularization) | ||
| + | * [[Regularization# | ||
| * Data Processing Tricks (see [[nlp:Data Preparation]]) | * Data Processing Tricks (see [[nlp:Data Preparation]]) | ||
| * [[nlp: | * [[nlp: | ||
| - | * Shared source and target embeddings | + | * [[https:// |
| * Architecture Tricks (see [[NN Architectures]]) | * Architecture Tricks (see [[NN Architectures]]) | ||
| * Residual connections | * Residual connections | ||
| Line 25: | Line 31: | ||
| * [[nlp: | * [[nlp: | ||
| * Copy mechanism | * Copy mechanism | ||
| - | * Seq2Seq and Generation Tricks | + | * [[nlp: |
| - | * Try different | + | * Try a different |
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| Line 32: | Line 38: | ||
| * Efficiency Tricks | * Efficiency Tricks | ||
| * [[GPU Deep Learning]] | * [[GPU Deep Learning]] | ||
| - | * [[GPU Deep Learning# | + | * [[GPU Deep Learning# |
| * [[Model Compression]] | * [[Model Compression]] | ||
| * Tricks for [[Edge Computing]] | * Tricks for [[Edge Computing]] | ||
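The Training Tricks entry on gradient accumulation (using several small batches to simulate a larger batch size) can be illustrated with a minimal sketch. It is not taken from the wiki page: the toy linear model, the mean-squared-error loss, and the names `grad_mse` and `accumulated_grad` are all assumptions made for illustration. The idea shown is that the gradients of micro-batches, each weighted by its share of the full batch, sum to the gradient of the full batch, so one optimizer step after accumulation behaves like one large-batch step.

```python
import math


def grad_mse(w, b, xs, ys):
    """Gradient of the mean squared error of y_hat = w * x + b over one batch."""
    n = len(xs)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return grad_w, grad_b


def accumulated_grad(w, b, xs, ys, micro_batch_size):
    """Accumulate micro-batch gradients so they match the full-batch gradient.

    Each micro-batch gradient is a mean over its own examples, so it is
    scaled by (micro-batch size / full batch size) before being added.
    Hypothetical helper, written only for this illustration.
    """
    n = len(xs)
    acc_w, acc_b = 0.0, 0.0
    for start in range(0, n, micro_batch_size):
        micro_xs = xs[start:start + micro_batch_size]
        micro_ys = ys[start:start + micro_batch_size]
        grad_w, grad_b = grad_mse(w, b, micro_xs, micro_ys)
        weight = len(micro_xs) / n  # handles a smaller final micro-batch
        acc_w += grad_w * weight
        acc_b += grad_b * weight
    return acc_w, acc_b


# Toy data generated from y = 2x + 1 (an assumption for the example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
w, b = 0.5, 0.0

full = grad_mse(w, b, xs, ys)
accumulated = accumulated_grad(w, b, xs, ys, micro_batch_size=2)

# A single SGD step using the accumulated gradient (learning rate is arbitrary).
learning_rate = 0.01
w_new = w - learning_rate * accumulated[0]
b_new = b - learning_rate * accumulated[1]
```

In a deep learning framework the same effect is usually obtained by scaling each micro-batch loss before the backward pass and calling the optimizer step only once every few micro-batches. Note that the equivalence is exact only for losses that average over examples; layers that depend on batch statistics, such as batch normalization, still see the smaller micro-batches.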
ml/nn_tricks.1633811650.txt.gz · Last modified: 2023/06/15 07:36 (external edit)