====== ml:alternative_training_methods ======

    * The direction doesn't have to be sampled from a random normal - the components only need to be independent. They could have sampled the components from {-1,1} (two discrete values). This would allow them to optimize [[model_compression#binarized_neural_networks|binary neural networks]] with their technique.
    * Follow-up work: [[https://arxiv.org/pdf/2209.06302.pdf|Belouze 2022 - Optimization without Backpropagation]]
  * [[https://arxiv.org/pdf/2212.13345.pdf|Hinton 2022 - The Forward-Forward Algorithm: Some Preliminary Investigations]]
  * [[https://arxiv.org/pdf/2305.17333.pdf|Malladi et al. 2023 - Fine-Tuning Language Models with Just Forward Passes]]
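The note above about sampling directions from {-1,1} can be sketched as an SPSA-style, forward-only gradient estimate. This is a minimal illustration, not code from any of the listed papers; ''spsa_grad'' and the quadratic toy loss are hypothetical names chosen for the example, and it assumes a scalar-valued loss function.

```python
import numpy as np

def spsa_grad(loss, w, eps=1e-3, rng=np.random.default_rng(0)):
    """SPSA-style gradient estimate from two forward passes, no backprop.

    The perturbation direction has independent {-1, +1} (Rademacher)
    components - independence is all the estimator needs, as noted above.
    """
    delta = rng.choice([-1.0, 1.0], size=w.shape)
    # Central difference of the loss along the random direction:
    g = (loss(w + eps * delta) - loss(w - eps * delta)) / (2 * eps)
    # For +/-1 components, 1/delta_i == delta_i, so scaling by delta
    # gives the usual SPSA per-coordinate estimate.
    return g * delta

# Toy usage: minimize a simple quadratic with forward passes only.
loss = lambda w: float(np.sum(w ** 2))
w = np.array([3.0, -2.0])
for _ in range(200):
    w -= 0.1 * spsa_grad(loss, w)
# w ends up close to the minimizer at the origin
```

Each update costs two loss evaluations regardless of the parameter count, which is why this style of estimator underlies forward-pass-only fine-tuning approaches like the Malladi et al. paper above.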
  
===== Related Pages =====
ml/alternative_training_methods · Last modified: 2023/06/15 07:36 (external edit)
