====== Neural Network Psychology ======

"Neural network psychology" is the study of what neural networks learn and why they make the predictions they do. See the papers below for examples.

===== Papers =====

  * [[https://www.aclweb.org/anthology/D16-1248.pdf|Shi et al 2016 - Why Neural Translations are the Right Length]]
  * [[https://www.aclweb.org/anthology/D16-1159.pdf|Shi et al 2016 - Does String-Based Neural MT Learn Source Syntax?]]
  * [[http://www.mitpressjournals.org/doi/pdfplus/10.1162/tacl_a_00115|Linzen et al 2016 - Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies]]
  * [[https://arxiv.org/pdf/1905.06316.pdf|Tenney et al 2019 - What do you learn from context? Probing for sentence structure in contextualized word representations]] (introduces the technique of "edge probing")
  * [[https://arxiv.org/pdf/1905.05950.pdf|Tenney et al 2019 - BERT Rediscovers the Classical NLP Pipeline]] (uses "edge probing")
  * [[https://www.aclweb.org/anthology/P19-1356.pdf|Jawahar et al 2019 - What does BERT learn about the structure of language?]]
  * [[https://helda.helsinki.fi/bitstream/handle/10138/263704/W18_5431_1.pdf?sequence=1|Raganato et al 2018 - An Analysis of Encoder Representations in Transformer-Based Machine Translation]]
  * [[https://arxiv.org/pdf/2310.03686|Langedijk et al 2023 - DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers]]

===== Notes =====

  * I believe Kevin Knight came up with this term.

===== Workshops =====

  * [[https://blackboxnlp.github.io/|BlackboxNLP]]

===== Related Pages =====

  * [[nlp:Bert and Friends#Interpretation Bertology|BERTology]]
  * [[nlp:Explainability]]
  * [[ml:Mechanistic Interpretability]]
  * [[nlp:Probing Experiments]]
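
===== Example: A Minimal Probing Classifier =====

Several of the papers above follow the same general recipe: freeze a pretrained model, extract representations from a given layer, and train a small supervised classifier on top to see what information those representations encode. The sketch below illustrates that general recipe as a layer-wise linear probe; it is not the edge-probing setup of any specific paper here, and it assumes HuggingFace ''transformers'', scikit-learn, and a made-up toy task.

<code python>
# Minimal layer-wise probing sketch (illustrative only): train a linear
# classifier on frozen hidden states from each layer and compare accuracies.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

# Toy labelled data for a hypothetical task: does the sentence contain a
# past-tense verb?
sentences = ["The cat sat on the mat.", "She walks to work.",
             "They finished the report.", "He eats an apple."]
labels = [1, 0, 1, 0]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

with torch.no_grad():
    enc = tokenizer(sentences, padding=True, return_tensors="pt")
    out = model(**enc)

# out.hidden_states is a tuple of (num_layers + 1) tensors, each [batch, seq, dim]
for layer, states in enumerate(out.hidden_states):
    # Mean-pool over non-padding tokens to get one vector per sentence
    mask = enc["attention_mask"].unsqueeze(-1)
    pooled = (states * mask).sum(1) / mask.sum(1)
    probe = LogisticRegression(max_iter=1000)
    probe.fit(pooled.numpy(), labels)          # in practice: use a train/test split
    acc = probe.score(pooled.numpy(), labels)  # training accuracy on toy data
    print(f"layer {layer:2d}: probe accuracy {acc:.2f}")
</code>

If the probe does much better at some layers than others, that is taken as evidence about where in the network the relevant information is encoded, which is the kind of layer-wise claim made in the Tenney et al and Jawahar et al papers.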