====== Recurrent Neural Networks ======

===== RNN Variants =====

  * [[https://arxiv.org/pdf/1905.13324.pdf|Zhang et al 2019 - A Lightweight Recurrent Network for Sequence Modeling]] Related to the Transformer, LRNs are a drop-in replacement for other RNNs that removes the sequential nature of RNN processing: the recurrence is essentially replaced by a key-query-value attention mechanism.

===== Theoretical Properties =====

  * [[https://arxiv.org/pdf/1805.04908.pdf|Weiss et al 2018 - On the Practical Computational Power of Finite Precision RNNs for Language Recognition]] Shows that RNN variants that can count are strictly more expressive than ones that cannot, and verifies this experimentally.
  * [[https://arxiv.org/pdf/2004.08500v1.pdf|Merrill et al 2020 - A Formal Hierarchy of RNN Architectures]] Interesting, but possibly inaccurate in practice because the theoretical analysis applies only to saturated RNNs (related follow-up [[https://arxiv.org/pdf/2010.09697.pdf|here]]).
  * [[https://www.aclweb.org/anthology/2020.emnlp-main.156.pdf|Hewitt et al 2020 - RNNs can generate bounded hierarchical languages with optimal memory]]

=== People ===

  * [[https://scholar.google.com/citations?user=9y9ALCQAAAAJ&hl=en|Michael Hahn]]

===== Converting RNNs to WFSAs =====

See [[WFSA#Converting RNNs to WFSAs]]

===== Related Pages =====

  * [[Seq2seq]]
  * [[ml:State-Space Models]]
  * [[WFSA]]
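The "attention instead of recurrence" idea mentioned under RNN Variants can be sketched as follows. This is a generic single-head key-query-value attention layer, not the exact LRN formulation from Zhang et al.; all names and shapes here are illustrative. The point is that every position is computed from the whole sequence at once, with no step-by-step hidden-state recurrence.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(X, Wq, Wk, Wv):
    """Single-head key-query-value attention over a sequence X of shape (T, d).

    Unlike an RNN, there is no h_t computed from h_{t-1}: all T outputs
    are produced in parallel from the (T, T) attention matrix.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (T, T) pairwise similarities
    return softmax(scores, axis=-1) @ V      # each row: weighted sum of values

# Toy usage with random weights (illustrative only).
rng = np.random.default_rng(0)
T, d = 5, 8
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
H = attention_layer(X, Wq, Wk, Wv)
print(H.shape)  # (5, 8)
```

The sequential dependency of an RNN is what forces O(T) serial steps; swapping it for an attention matrix trades that for O(T^2) pairwise scores that parallelize well, which is the trade-off this family of models exploits.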