====== Dynamic NNs and Conditional Computation ======

Dynamic neural networks use methods such as conditional computation, adaptive computation, dynamic model sparsification, and early-exit approaches to build larger models without a proportional increase in compute requirements.

===== Overviews =====

* [[https://arxiv.org/pdf/2102.04906.pdf|Han et al. 2021 - Dynamic Neural Networks: A Survey]]

===== Papers =====

* [[https://arxiv.org/pdf/2101.03961.pdf|Fedus et al. 2021 - Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity]] Uses conditional computation to scale up the Transformer. The introduction gives a good overview of conditional computation.
* [[https://arxiv.org/pdf/2202.01169.pdf|Clark et al. 2022 - Unified Scaling Laws for Routed Language Models]]

===== Workshops =====

* [[https://dynn-icml2022.github.io/|DyNN Workshop 2022]]

===== Related Pages =====

* [[Mixture of Expert Models|Mixture of Experts]]
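The early-exit idea mentioned above can be illustrated with a minimal sketch. This is a toy pure-Python model invented for illustration, not taken from any of the listed papers: each "layer" has an exit head that produces a confidence estimate, and the forward pass stops as soon as that estimate clears a threshold, so easy inputs use fewer layers than hard ones.

```python
# Toy early-exit forward pass (hypothetical illustration, not from the
# surveyed papers): each scalar "layer" refines a hidden value, and an
# exit head after every layer decides whether to stop computing.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def early_exit_forward(x: float, layer_weights: list,
                       threshold: float = 0.9):
    """Run layers sequentially; return (confidence, layers_used)."""
    h = x
    for i, w in enumerate(layer_weights, start=1):
        h = h * w + 0.5           # toy "layer" transformation
        conf = sigmoid(abs(h))    # confidence proxy from this exit head
        if conf >= threshold:     # confident enough: skip remaining layers
            return conf, i
    return sigmoid(abs(h)), len(layer_weights)

conf, used = early_exit_forward(1.0, [0.8, 1.2, 1.5, 2.0], threshold=0.9)
print(used, "layers used, confidence", round(conf, 3))
```

With these made-up weights the model exits after three of the four layers; a real early-exit network (e.g. with per-layer classifier heads trained jointly) applies the same control flow per input at inference time.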