State-Space Models
Overviews
- S4 model: Gu et al. 2021 - Efficiently Modeling Long Sequences with Structured State Spaces. Good introduction to state spaces.
- Gu & Dao 2023 - Mamba: Linear-Time Sequence Modeling with Selective State Spaces. Nice overview of SSMs.
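As a rough orientation for the papers above, the core object can be sketched as a discrete linear state-space recurrence. The sketch below is a toy under stated assumptions, not the S4 or Mamba implementation. It assumes a diagonal state matrix and a scalar input sequence. All names (`ssm_recurrent`, `ssm_kernel`, `ssm_convolutional`) and the parameter values are hypothetical, and the continuous-time discretization step used in S4 is skipped. It shows that the same input-output map can be computed either step by step or as a causal convolution.

```python
def ssm_recurrent(a, b, c, u):
    """Diagonal SSM run step by step.

    State update: x_k = a * x_{k-1} + b * u_k (elementwise, since a is diagonal)
    Output:       y_k = sum_i c_i * x_k[i]
    """
    x = [0.0] * len(a)
    ys = []
    for u_k in u:
        x = [a_i * x_i + b_i * u_k for a_i, x_i, b_i in zip(a, x, b)]
        ys.append(sum(c_i * x_i for c_i, x_i in zip(c, x)))
    return ys


def ssm_kernel(a, b, c, length):
    """Convolution kernel K_j = sum_i c_i * a_i**j * b_i for j = 0..length-1."""
    return [
        sum(c_i * (a_i ** j) * b_i for a_i, b_i, c_i in zip(a, b, c))
        for j in range(length)
    ]


def ssm_convolutional(a, b, c, u):
    """Same map as a causal convolution: y_k = sum_j K_j * u_{k-j}."""
    kernel = ssm_kernel(a, b, c, len(u))
    return [
        sum(kernel[j] * u[k - j] for j in range(k + 1))
        for k in range(len(u))
    ]


# Hypothetical toy parameters, chosen only for illustration.
a = [0.9, 0.5]    # diagonal of the state matrix (|a_i| < 1 keeps the state bounded)
b = [1.0, 1.0]    # input projection
c = [0.5, -0.25]  # output projection
u = [1.0, 0.0, 2.0, -1.0]

y_rec = ssm_recurrent(a, b, c, u)
y_conv = ssm_convolutional(a, b, c, u)
```

The two outputs agree up to floating-point error. The recurrent form takes one sequential pass over the sequence. The convolution written this way is quadratic in sequence length, but a convolution of this kind can be evaluated with an FFT in O(L log L), which is the dual view that S4 builds on.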
Key Papers
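- Mega: Ma et al. 2022 - Mega: Moving Average Equipped Gated Attention. State of the art on the Long Range Arena benchmark at the time of publication. Combines gated attention (in the style of FLASH's gated attention unit) with an exponential moving average, which acts as a simple state-space-style recurrence. The full model still has quadratic runtime in sequence length, however.
- Orvieto et al. 2023 - Resurrecting Recurrent Neural Networks for Long Sequences. Gives a nice history.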
Papers
People
Related Pages
ml/state-space_models.1714489317.txt.gz · Last modified: 2024/04/30 15:01 by jmflanig