ml:nn_training

Differences

This shows you the differences between two versions of the page.

Old revision: ml:nn_training [2023/06/15 07:36] – external edit 127.0.0.1
New revision: ml:nn_training [2024/07/09 22:29] (current) – [Topics] jmflanig
Line 23:
     * [[nlp:Transformers#Training|Transformer Training Tricks]]
     * Residual connections, [[https://arxiv.org/pdf/2003.04887.pdf|ReZero]]
 +    * [[https://arxiv.org/pdf/1710.03740|Mixed Precision Training]] (also [[https://docs.nvidia.com/deeplearning/performance/mixed-precision-training/index.html|Train With Mixed Precision - NVIDIA Docs]], see other papers as well)
   * [[Large-Scale]] and [[Distributed Training]]
  
ml/nn_training.1686814574.txt.gz · Last modified: 2023/06/15 07:36 by 127.0.0.1
