====== Recurrent Neural Networks ======

===== RNN Variants =====

  * [[https://arxiv.org/pdf/1905.13324.pdf|Zhang et al 2019 - A Lightweight Recurrent Network for Sequence Modeling]] Related to the Transformer, LRNs are a drop-in replacement for other RNNs that removes the sequential nature of RNN processing: the recurrence is essentially replaced by a key-query-value attention mechanism.

===== Theoretical Properties =====

  * [[https://arxiv.org/pdf/1805.04908.pdf|Weiss et al 2018 - On the Practical Computational Power of Finite Precision RNNs for Language Recognition]] Shows that RNN variants that can count are strictly more expressive than ones that cannot, and verifies this experimentally.
  * [[https://arxiv.org/pdf/2004.08500v1.pdf|Merrill et al 2020 - A Formal Hierarchy of RNN Architectures]] Interesting, but possibly inaccurate in practice because the theoretical analysis applies only to saturated RNNs (related follow-up [[https://arxiv.org/pdf/2010.09697.pdf|here]]).
  * [[https://www.aclweb.org/anthology/2020.emnlp-main.156.pdf|Hewitt et al 2020 - RNNs can generate bounded hierarchical languages with optimal memory]]

=== People ===

  * [[https://scholar.google.com/citations?user=9y9ALCQAAAAJ&hl=en|Michael Hahn]]

===== Converting RNNs to WFSAs =====

See [[WFSA#Converting RNNs to WFSAs]]

===== Related Pages =====

  * [[Seq2seq]]
  * [[ml:State-Space Models]]
  * [[WFSA]]
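The "attention instead of recurrence" idea mentioned under RNN Variants can be sketched as follows. This is a generic single-head key-query-value attention layer, not the exact LRN formulation from Zhang et al.; all names and shapes here are illustrative. The point is that every position is computed from the whole sequence at once, with no step-by-step hidden-state recurrence.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(X, Wq, Wk, Wv):
    """Single-head key-query-value attention over a sequence X of shape (T, d).

    Unlike an RNN, there is no h_t computed from h_{t-1}: all T outputs
    are produced in parallel from the (T, T) attention matrix.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (T, T) pairwise similarities
    return softmax(scores, axis=-1) @ V      # each row: weighted sum of values

# Toy usage with random weights (illustrative only).
rng = np.random.default_rng(0)
T, d = 5, 8
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
H = attention_layer(X, Wq, Wk, Wv)
print(H.shape)  # (5, 8)
```

The sequential dependency of an RNN is what forces O(T) serial steps; swapping it for an attention matrix trades that for O(T^2) pairwise scores that parallelize well, which is the trade-off this family of models exploits.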