ml:gpu_deep_learning
Differences
This shows you the differences between two versions of the page.
| ml:gpu_deep_learning [2025/03/25 07:44] – [Memory Reduction Techniques] jmflanig | ml:gpu_deep_learning [2025/07/17 03:25] (current) – [Miscellaneous Transformer & GPU Papers] jmflanig | ||
|---|---|---|---|
| Line 19: | Line 19: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | * Ampere Architecture Whitepaper: [[https:// | ||
| + | * Hopper Architecture Whitepaper: [[https:// | ||
| + | * [[https:// | ||
| * Examples of GPU performance analysis | * Examples of GPU performance analysis | ||
| * [[https:// | * [[https:// | ||
| Line 80: | Line 83: | ||
| * [[https:// | * [[https:// | ||
| * Papers | * Papers | ||
| - | * **Gradient Checkpointing**: | + | * **Gradient |
| * Implemented in PyTorch in torch.utils.checkpoint: | * Implemented in PyTorch in torch.utils.checkpoint: | ||
| - | * A paper that seems to have re-invented this as " | + | * [[https:// | ||
| * Computing the forward gradient instead of using backprop would allow you to reduce the memory cost of the computation graph (don't need to keep nodes that won't be used later). | * Computing the forward gradient instead of using backprop would allow you to reduce the memory cost of the computation graph (don't need to keep nodes that won't be used later). | ||
| * [[https:// | * [[https:// | ||
| Line 99: | Line 102: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | * [[https:// | ||
| ===== Customized Implementations on GPUs ===== | ===== Customized Implementations on GPUs ===== | ||
| Line 111: | Line 115: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| ===== Resources ===== | ===== Resources ===== | ||
| Line 131: | Line 137: | ||
| ===== Related Pages ===== | ===== Related Pages ===== | ||
| * [[Distributed Training]] | * [[Distributed Training]] | ||
| + | * [[Efficient NNs]] | ||
| * [[Systems & ML]] | * [[Systems & ML]] | ||
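The gradient checkpointing entry under Memory Reduction Techniques can be illustrated with a small sketch. This is not PyTorch's torch.utils.checkpoint implementation. It is a pure-Python toy that assumes a chain of scalar layers y = tanh(w * x), and every name in it (`layer`, `backprop_full`, `backprop_checkpointed`, `segment`) is hypothetical. It keeps only every `segment`-th layer input during the forward pass and recomputes the rest one segment at a time during the backward pass, trading extra compute for fewer stored activations.

```python
import math


def layer(w, x):
    """Toy scalar layer: y = tanh(w * x)."""
    return math.tanh(w * x)


def layer_grads(w, x, upstream):
    """Return (dL/dx, dL/dw) for y = tanh(w * x), given upstream = dL/dy."""
    y = math.tanh(w * x)
    local = (1.0 - y * y) * upstream
    return local * w, local * x


def backprop_full(weights, x0):
    """Ordinary backprop: store the input of every layer."""
    inputs = []
    x = x0
    for w in weights:
        inputs.append(x)
        x = layer(w, x)
    grad_x = 1.0  # the loss is the final output itself
    grad_w = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grad_x, grad_w[i] = layer_grads(weights[i], inputs[i], grad_x)
    return x, grad_x, grad_w, len(inputs)


def backprop_checkpointed(weights, x0, segment):
    """Store only every `segment`-th layer input; recompute the others."""
    n = len(weights)
    checkpoints = {}
    x = x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x  # keep this layer input
        x = layer(w, x)         # all other inputs are discarded
    grad_x = 1.0
    grad_w = [0.0] * n
    peak = len(checkpoints)
    for start in reversed(range(0, n, segment)):
        end = min(start + segment, n)
        # Recompute the inputs of layers start..end-1 from the checkpoint.
        acts = [checkpoints[start]]
        for i in range(start, end - 1):
            acts.append(layer(weights[i], acts[-1]))
        # Values held at once: all checkpoints plus the recomputed inputs.
        peak = max(peak, len(checkpoints) + len(acts) - 1)
        for i in reversed(range(start, end)):
            grad_x, grad_w[i] = layer_grads(weights[i], acts[i - start], grad_x)
    return x, grad_x, grad_w, peak


weights = [0.5 + 0.1 * i for i in range(16)]
full = backprop_full(weights, 0.3)
ckpt = backprop_checkpointed(weights, 0.3, segment=4)
print("stored values, full backprop:", full[3])
print("stored values, checkpointed: ", ckpt[3])
```

With 16 layers and a segment length of 4, the checkpointed version holds at most 7 values at once instead of 16, at the cost of roughly one extra forward pass. Both versions should produce the same gradients, since the recomputation repeats the same operations on the same inputs.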
ml/gpu_deep_learning.1742888663.txt.gz · Last modified: 2025/03/25 07:44 by jmflanig