ml:graphical_models

Last modified: 2025/05/03 01:43 by jmflanig
===== Overviews =====
  * Graphical Models
    * [[https://direct.mit.edu/books/edited-volume/3811/chapter-standard/125067/Graphical-Models-in-a-Nutshell|Koller et al 2007 - Graphical Models in a Nutshell]] (book chapter)
  * Deep Latent Variable Models
    * Paper: [[https://arxiv.org/pdf/1812.06834.pdf|Kim et al 2018 - A Tutorial on Deep Latent Variable Models of Natural Language]]
  * [[https://arxiv.org/pdf/1505.04406.pdf|Bach et al 2017 - Hinge-Loss Markov Random Fields and Probabilistic Soft Logic]]
  * **[[https://arxiv.org/pdf/2010.12048.pdf|Chiang & Riley 2020 - Factor Graph Grammars]]** Introduces a new kind of graphical model (factor graph grammars) that is more expressive than plate notation or dynamic graphical models. It is expressive enough to represent CFG parsing as a graphical model. Very cool.
  * [[https://www.jmlr.org/papers/volume21/18-856/18-856.pdf|Al-Shedivat et al 2020 - Contextual Explanation Networks]]
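For readers new to the area, a minimal sketch of the kind of model these overviews cover: a tiny chain factor graph over three binary variables, with made-up potential values, whose partition function and a marginal are computed by brute-force enumeration.

```python
import itertools

# Toy chain factor graph over three binary variables: x1 -- x2 -- x3.
# The potential values are invented for illustration only.
def phi(a, b):
    """Pairwise potential favouring agreement between neighbours."""
    return 2.0 if a == b else 1.0

def unnormalized(x1, x2, x3):
    # p(x1, x2, x3) is proportional to the product of the two factors.
    return phi(x1, x2) * phi(x2, x3)

# Partition function Z: sum the unnormalized score over all 2^3 assignments.
Z = sum(unnormalized(*x) for x in itertools.product([0, 1], repeat=3))

# Marginal P(x2 = 1): sum out x1 and x3, then normalize.
p_x2_1 = sum(unnormalized(x1, 1, x3)
             for x1 in [0, 1] for x3 in [0, 1]) / Z
print(Z, p_x2_1)  # prints 18.0 0.5
```

Brute-force enumeration is exponential in the number of variables; the algorithms in the overviews above (variable elimination, belief propagation) exploit the graph structure to do the same computation efficiently.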
  
===== Interesting NLP Deep Learning + PGM Papers =====
See also recent advances in [[nlp:HMM|HMMs]] and [[Conditional Random Field|CRFs]].
  * [[https://arxiv.org/pdf/1702.00887.pdf|Kim et al 2017 - Structured Attention Networks]]
  * [[https://aclanthology.org/C18-1142.pdf|Bahuleyan et al 2018 - Variational Attention for Sequence-to-Sequence Models]]
  * [[https://www.aclweb.org/anthology/K18-1001.pdf|Thai et al 2018 - Embedded-State Latent Conditional Random Fields for Sequence Labeling]]
  * [[http://proceedings.mlr.press/v80/kaiser18a/kaiser18a.pdf|Kaiser et al 2018 - Fast Decoding in Sequence Models Using Discrete Latent Variables]]
  * [[https://arxiv.org/pdf/2002.07233.pdf|Lee et al 2020 - On the Discrepancy between Density Estimation and Sequence Generation]] Uses latent variables for fast non-autoregressive generation
  * [[http://proceedings.mlr.press/v119/srivastava20a/srivastava20a.pdf|Srivastava et al 2020 - Robustness to Spurious Correlations via Human Annotations]]
  * [[https://arxiv.org/pdf/2106.02736.pdf|Goyal et al 2021 - Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis–Hastings]]
  * [[https://arxiv.org/pdf/2105.15021.pdf|Yang et al 2021 - Neural Bi-Lexicalized PCFG Induction]] Uses a Bayesian network to describe their model
  * [[https://arxiv.org/pdf/2406.06950|Hou et al 2024 - A Probabilistic Framework for LLM Hallucination Detection via Belief Tree Propagation]]
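The Goyal et al 2021 entry treats a masked language model as an implicit energy network and samples from it with Metropolis–Hastings. The sketch below shows just the bare MH loop on a toy energy over binary sequences; the energy function and all constants here are invented for illustration (in the paper the energy comes from an MLM).

```python
import math
import random

random.seed(0)

# Toy Metropolis-Hastings sampler for an energy-based model over binary
# sequences, where the target distribution is proportional to exp(-energy).
def energy(seq):
    # Lower energy (higher probability) when adjacent positions agree.
    return -float(sum(a == b for a, b in zip(seq, seq[1:])))

def mh_step(seq):
    # Symmetric proposal: flip one randomly chosen position, so the
    # proposal ratio cancels in the acceptance probability.
    i = random.randrange(len(seq))
    proposal = list(seq)
    proposal[i] = 1 - proposal[i]
    # Accept with probability min(1, p(proposal) / p(seq)).
    accept_prob = math.exp(min(0.0, energy(seq) - energy(proposal)))
    return proposal if random.random() < accept_prob else seq

seq = [random.randint(0, 1) for _ in range(8)]
for _ in range(2000):
    seq = mh_step(seq)
print(seq)  # after burn-in, samples favour mostly-agreeing sequences
```

Because the proposal is symmetric, this is the Metropolis special case; with an asymmetric proposal (e.g. one driven by MLM conditionals, as in the paper) the acceptance ratio must include the proposal correction.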
  
===== Recent NLP Papers that Use PGMs =====
    * [[https://aclanthology.org/2020.findings-emnlp.142.pdf|Chen et al 2020 - Neural Dialogue State Tracking with Temporally Expressive Networks]]
    * [[https://aclanthology.org/P19-1382.pdf|Bjerva et al 2019 - Uncovering Probabilistic Implications in Typological Knowledge Bases]]
  * **MCMC and Sampling**
    * [[https://aclanthology.org/D12-1101.pdf|Singh et al 2012 - Monte Carlo MCMC: Efficient Inference by Approximate Sampling]]
    * **[[https://aclanthology.org/D18-1405.pdf|Ma & Collins 2018 - Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency]]**
    * [[https://aclanthology.org/N18-1085.pdf|Lin & Eisner 2018 - Neural Particle Smoothing for Sampling from Conditional Sequence Models]]
    * [[https://aclanthology.org/2020.aacl-main.21.pdf|Wang et al 2020 - Neural Gibbs Sampling for Joint Event Argument Extraction]]
    * [[https://aclanthology.org/2020.acl-main.196.pdf|Logan et al 2020 - On Importance Sampling-Based Evaluation of Latent Language Models]]
    * [[https://www.aclweb.org/anthology/2020.emnlp-main.406.pdf|Gao & Gormley 2020 - Training for Gibbs Sampling on Conditional Random Fields with Neural Scoring Factors]] (basically adapted [[https://www.aclweb.org/anthology/P05-1045.pdf|Finkel et al 2005]] to the neural era)
    * [[https://arxiv.org/pdf/2106.02736.pdf|Goyal et al 2021 - Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis–Hastings]]
  * **Variational Inference**
    * [[https://aclanthology.org/P19-1186.pdf|Lee et al 2019 - Semi-supervised Stochastic Multi-Domain Learning using Variational Inference]]
    * [[https://aclanthology.org/2020.acl-main.367.pdf|Emerson 2020 - Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics]]
  * **Other Papers**
    * [[https://arxiv.org/pdf/2306.05836.pdf|Jin et al 2023 - Can Large Language Models Infer Causation from Correlation?]]
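Several of the MCMC entries above (Finkel et al 2005, Gao & Gormley 2020, Wang et al 2020) rely on Gibbs sampling in chain-structured models. Here is a minimal self-contained sketch, with fixed toy log-potentials standing in for the neural scoring factors, that checks the Gibbs estimate of a marginal against exact enumeration; all numbers are illustrative assumptions.

```python
import itertools
import math
import random

random.seed(0)

# Toy Gibbs sampler for a chain MRF over N binary labels y1..yN.
N = 4
def log_phi(a, b):
    return 0.8 if a == b else 0.0  # favour matching neighbours

def log_score(y):
    return sum(log_phi(a, b) for a, b in zip(y, y[1:]))

# Exact marginal P(y1 = 1) by enumeration (feasible at this toy size).
Z = sum(math.exp(log_score(y)) for y in itertools.product([0, 1], repeat=N))
exact = sum(math.exp(log_score(y))
            for y in itertools.product([0, 1], repeat=N)
            if y[0] == 1) / Z

def resample(y, i):
    # Resample y[i] from its conditional given the rest. Scoring the full
    # chain is wasteful but correct: factors not touching y[i] cancel in
    # the normalized ratio.
    weights = []
    for v in (0, 1):
        y[i] = v
        weights.append(math.exp(log_score(y)))
    y[i] = 1 if random.random() < weights[1] / (weights[0] + weights[1]) else 0

y = [random.randint(0, 1) for _ in range(N)]
hits = 0
sweeps = 5000
for _ in range(sweeps):
    for i in range(N):
        resample(y, i)
    hits += y[0]
print(exact, hits / sweeps)  # the two values should roughly agree
```

The papers above replace these fixed potentials with learned neural factors; the sampler itself is unchanged, which is what makes Gibbs sampling attractive when exact inference in the neural-scored model is intractable.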
  
===== Courses, Tutorials, and Overview Papers =====
  * Sometimes PGMs are covered in the UCSC course [[https://courses.soe.ucsc.edu/courses/cse290c|CSE 290C]] (when [[https://courses.soe.ucsc.edu/courses/cmps290c/Fall15/01|Lise Getoor teaches it]])
  * **Course at CMU**: Probabilistic Graphical Models [[https://www.cs.cmu.edu/~epxing/Class/10708-20/index.html|Spring 2020]] [[https://www.cs.cmu.edu/~epxing/Class/10708-20/lectures.html|Lectures with videos]] [[https://www.cs.cmu.edu/~epxing/Class/10708/|2014 (with videos and scribe notes)]]
  * Stanford course: [[https://ermongroup.github.io/cs228/|CS 228 - Probabilistic Graphical Models]]
  * **Matt Gormley's course at CMU**: [[https://www.cs.cmu.edu/~mgormley/courses/10418/|10418]] (with videos)
  * **Best overview tutorial:** [[https://kuleshov.github.io/cs228-notes/|CS228 Lecture Notes]]