nlp:pretraining

===== Key and Early Papers =====
  * [[https://arxiv.org/pdf/1506.06726|Kiros et al 2015 - Skip-Thought Vectors]]
  * [[https://arxiv.org/pdf/1511.01432.pdf|Dai et al 2015 - Semi-supervised Sequence Learning]]
  * [[https://arxiv.org/pdf/1705.00108|Peters et al 2017 - Semi-supervised Sequence Tagging with Bidirectional Language Models]]
  * [[https://arxiv.org/pdf/1611.02683.pdf|Ramachandran et al 2017 - Unsupervised Pretraining for Sequence to Sequence Learning]]
  * [[https://arxiv.org/pdf/1802.05365.pdf|Peters et al 2018 - Deep Contextualized Word Representations]]
nlp/pretraining.1771569086.txt.gz · Last modified: 2026/02/20 06:31 by jmflanig
