Neural Networks: Alternative Training Methods
Papers
- Such et al. 2017 - Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning. Trains neural networks with genetic algorithms instead of backprop.
- Baydin et al. 2022 - Gradients without Backpropagation. Uses forward-mode automatic differentiation to compute a “forward gradient” with no backward pass. Essentially it computes the directional derivative of the loss along a random direction; scaling that direction by the directional derivative gives an unbiased estimate of the true gradient, which they plug into gradient descent. This has a number of important implications:
- They could have used a finite-difference approximation to the directional derivative by taking a small step along the random direction. This would allow computing the change in loss even for discontinuous functions.
- The direction doesn't have to be sampled from a random normal - the components only need to be independent with zero mean and unit variance. They could have sampled the components from {-1, +1} (two discrete values). This would allow them to optimize binary neural networks with their technique.
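A minimal sketch of the forward-gradient update combining the two points above: the directional derivative is estimated by finite differences, and the direction's components are drawn from {-1, +1}. The quadratic loss, step size, and step count are assumptions for illustration, not from the paper.

```python
import numpy as np

def loss(theta):
    # toy quadratic loss with minimum at theta = 1 (assumption for illustration)
    return np.sum((theta - 1.0) ** 2)

def forward_gradient(theta, rng, eps=1e-4):
    # random direction with independent components from {-1, +1};
    # E[v v^T] = I, so the estimate below is still unbiased
    v = rng.choice([-1.0, 1.0], size=theta.shape)
    # finite-difference directional derivative: (loss(theta + eps*v) - loss(theta)) / eps
    dd = (loss(theta + eps * v) - loss(theta)) / eps
    # scale the direction by the directional derivative:
    # E[(grad · v) v] = grad, an unbiased estimate of the true gradient
    return dd * v

rng = np.random.default_rng(0)
theta = np.zeros(3)
for _ in range(500):
    # plain gradient descent with the forward-gradient estimate
    theta = theta - 0.05 * forward_gradient(theta, rng)
```

Only two loss evaluations per step are needed, and nothing in the loop requires the loss to be differentiable at the evaluation points.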
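The genetic algorithm in Such et al. is essentially truncation selection plus Gaussian parameter mutation, with no gradients anywhere. A minimal sketch of that loop on a toy fitness function (the population size, mutation scale, elite count, and quadratic fitness are assumptions for illustration):

```python
import numpy as np

def fitness(theta):
    # toy fitness with maximum at theta = 1 (assumption for illustration)
    return -np.sum((theta - 1.0) ** 2)

def ga_step(population, rng, sigma=0.1, n_elite=5):
    # truncation selection: rank by fitness and keep the top n_elite unchanged
    scores = np.array([fitness(p) for p in population])
    elite = population[np.argsort(scores)[::-1][:n_elite]]
    # refill the population with mutated copies of random elites
    parents = elite[rng.integers(0, n_elite, size=len(population) - n_elite)]
    children = parents + sigma * rng.standard_normal(parents.shape)
    return np.concatenate([elite, children])

rng = np.random.default_rng(0)
pop = rng.standard_normal((50, 3))
for _ in range(200):
    pop = ga_step(pop, rng)
best = max(pop, key=fitness)
```

Because the elites are carried over unmutated, the best fitness in the population never decreases from one generation to the next.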
Related Pages
ml/alternative_training_methods.1652588221.txt.gz · Last modified: 2023/06/15 07:36 (external edit)