Neural Network Tricks
- Training Tricks (see NN Training)
- Gradient clipping (Pascanu et al., 2012)
- Regularization Tricks (see Regularization)
- Knowledge Distillation (can improve performance by acting as a form of regularization)
- Data Processing Tricks (see Data Preparation)
- Subword Units (BPE, WordPiece, subword regularization, BPE dropout; a shared source and target vocabulary for subword units)
- Shared source and target embeddings
- Architecture Tricks (see NN Architectures)
- Residual connections
- Weight sharing
- Efficiency Tricks
- Tricks for Edge Computing
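Gradient clipping, listed above under Training Tricks, rescales the whole gradient when its norm exceeds a threshold, which stabilizes training when gradients explode. A minimal sketch of clipping by global L2 norm; the function name and the flat list of gradient components are illustrative, not from the linked page:

```python
import math

def clip_by_global_norm(grads, max_norm):
    # Rescale all gradient components together if their global L2 norm
    # exceeds max_norm (gradient clipping as in Pascanu et al., 2012).
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads
```

Clipping the whole gradient by one scale factor preserves its direction; clipping each component independently would not.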
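The knowledge-distillation entry above trains a student on the teacher's softened output distribution. A minimal sketch of the distillation loss under the usual temperature convention; the function names and the T**2 scaling convention follow Hinton et al. (2015), not the linked wiki page:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; T > 1 softens the distribution.
    m = max(l / T for l in logits)
    exps = [math.exp(l / T - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Cross-entropy between the softened teacher and student distributions;
    # the T*T factor keeps gradient magnitudes comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -T * T * sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

In practice this term is mixed with the ordinary cross-entropy on the hard labels.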
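Shared source and target embeddings (and weight sharing more generally) reuse one parameter matrix in several places. A toy sketch of weight tying, where the embedding matrix doubles as the output projection; the class name and the fixed toy matrix are hypothetical:

```python
class TiedEmbeddings:
    # Toy sketch: one matrix E serves both as the input embedding lookup
    # and as the output (softmax) projection, halving those parameters.
    def __init__(self, vocab_size, dim):
        # Deterministic toy weights; real models would initialize randomly.
        self.E = [[0.01 * (i + j) for j in range(dim)] for i in range(vocab_size)]

    def embed(self, token_id):
        return self.E[token_id]

    def logits(self, hidden):
        # Output projection reuses E transposed: logit_i = <hidden, E[i]>.
        return [sum(h * w for h, w in zip(hidden, row)) for row in self.E]
```

With a shared subword vocabulary, the same trick can also tie the source and target embedding tables of a translation model.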
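Residual connections, listed above under Architecture Tricks, add a layer's input back onto its output so the layer only has to learn a residual. A minimal sketch with vectors as plain lists; the function name is illustrative:

```python
def residual_block(x, f):
    # y = x + f(x): the identity path lets gradients flow unchanged,
    # so f only needs to model the correction to x.
    return [xi + fi for xi, fi in zip(x, f(x))]
```

If f outputs all zeros, the block reduces to the identity, which is what makes very deep stacks trainable.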
Older NN Tricks
Related Pages