====== Dynamic NNs and Conditional Computation ======

Dynamic neural networks use methods such as conditional computation, adaptive computation, dynamic model sparsification, and early-exit approaches to build larger models without a proportional increase in compute requirements.

===== Overviews =====

* [[https://arxiv.org/pdf/2102.04906.pdf|Han et al. 2021 - Dynamic Neural Networks: A Survey]]

===== Papers =====

* [[https://arxiv.org/pdf/2101.03961.pdf|Fedus et al. 2021 - Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity]] Uses conditional computation to scale up the Transformer. The introduction gives a good overview of conditional computation.
* [[https://arxiv.org/pdf/2202.01169.pdf|Clark et al. 2022 - Unified Scaling Laws for Routed Language Models]]

===== Workshops =====

* [[https://dynn-icml2022.github.io/|DyNN Workshop 2022]]

===== Related Pages =====

* [[Mixture of Expert Models|Mixture of Experts]]
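The early-exit idea mentioned above can be illustrated with a minimal sketch. This is a toy pure-Python model invented for illustration, not taken from any of the listed papers: each "layer" has an exit head that produces a confidence estimate, and the forward pass stops as soon as that estimate clears a threshold, so easy inputs use fewer layers than hard ones.

```python
# Toy early-exit forward pass (hypothetical illustration, not from the
# surveyed papers): each scalar "layer" refines a hidden value, and an
# exit head after every layer decides whether to stop computing.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def early_exit_forward(x: float, layer_weights: list,
                       threshold: float = 0.9):
    """Run layers sequentially; return (confidence, layers_used)."""
    h = x
    for i, w in enumerate(layer_weights, start=1):
        h = h * w + 0.5           # toy "layer" transformation
        conf = sigmoid(abs(h))    # confidence proxy from this exit head
        if conf >= threshold:     # confident enough: skip remaining layers
            return conf, i
    return sigmoid(abs(h)), len(layer_weights)

conf, used = early_exit_forward(1.0, [0.8, 1.2, 1.5, 2.0], threshold=0.9)
print(used, "layers used, confidence", round(conf, 3))
```

With these made-up weights the model exits after three of the four layers; a real early-exit network (e.g. with per-layer classifier heads trained jointly) applies the same control flow per input at inference time.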