====== LSTMs ======

===== Bi-directional LSTMs =====

Prior to Transformer models, Bi-LSTMs with max pooling were a standard baseline architecture for sentence embeddings.
From [[https://arxiv.org/pdf/1808.08762.pdf|Talman et al. 2018 - Sentence Embeddings in NLI with Iterative Refinement Encoders]]:

<blockquote>
[[https://arxiv.org/pdf/1705.02364.pdf|Conneau et al. (2017)]] explore multiple different sentence embedding architectures ranging from LSTM, BiLSTM and intra-attention to convolution neural networks and the performance of these architectures on NLI tasks. They show that, out of these models, BiLSTM with max pooling achieves the strongest results not only in NLI but also in many other NLP tasks requiring sentence-level meaning representations. They also show that their model trained on NLI data achieves strong performance on various transfer learning tasks.
</blockquote>

With careful tuning, recurrent models can still match or outperform Transformer models on some tasks. See [[https://arxiv.org/pdf/1804.09849.pdf|Chen et al. 2018 - The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation]].
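
The encoder described above runs a BiLSTM over the tokens and then takes an element-wise max over the time dimension, yielding a fixed-size sentence vector regardless of sentence length. A minimal pure-Python sketch of just the pooling step — the per-token vectors here are invented stand-ins for real BiLSTM outputs (forward and backward hidden states concatenated):

```python
# Hypothetical BiLSTM outputs for a 3-token sentence; each row is one
# token's vector, dim = 2 * hidden_size (forward + backward states).
# Values are illustrative, not from any trained model.
bilstm_out = [
    [0.1, -0.3,  0.5,  0.2],
    [0.4,  0.0, -0.1,  0.9],
    [-0.2, 0.7,  0.3, -0.5],
]

# Max pooling: element-wise max over the time (token) axis gives a
# fixed-size sentence embedding.
sentence_emb = [max(column) for column in zip(*bilstm_out)]
print(sentence_emb)  # [0.4, 0.7, 0.5, 0.9]
```

Because the max is taken per dimension, each component of the sentence embedding can come from a different token, which is part of why this pooling works better than taking only the final hidden state.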
  
===== Resources =====
  * [[https://people.idsia.ch/~juergen/lstm/|Juergen's LSTM Tutorial]]
nlp/lstm.1613644268.txt.gz · Last modified: 2023/06/15 07:36 (external edit)