====== Probabilistic Graphical Models (PGMs) ======
Probabilistic graphical models (PGMs) are a sub-area of machine learning and statistics.  PGMs are a framework for representing the independence assumptions among random variables in a probability distribution.  Broadly, the study of PGMs includes the study of algorithms for learning and inference in these complex probability distributions.  PGMs have applications in machine learning, statistics, natural language processing, speech recognition, computer vision, robotics, and other areas.  Topics include Bayesian Networks, Hidden Markov Models (HMMs), Conditional Random Fields (CRFs), Markov Random Fields (MRFs), Variational Inference, and [[bayesian_methods#Bayesian nonparametrics]].
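To make the factorization idea concrete, here is a minimal sketch (not from any of the resources below; the conditional probability tables are made up) of a three-node Bayesian network A -> B -> C, whose edges encode the independence assumption that C is independent of A given B:

```python
# Sketch of a Bayesian network A -> B -> C over binary variables.
# The graph's independence assumption (C independent of A given B)
# lets the joint distribution factor as
#   P(a, b, c) = P(a) * P(b | a) * P(c | b).

# Hypothetical conditional probability tables (values 0/1).
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # p_c_given_b[b][c]

def joint(a, b, c):
    """Joint probability under the factorization implied by the graph."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# The factorized joint still sums to 1 over all eight assignments.
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(total)  # ~1.0 (up to floating-point rounding)
```

Storing three small local tables instead of one table over all eight joint assignments is exactly the saving that the graph's independence assumptions buy.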

===== Courses, Tutorials, and Overview Papers =====

  * Sometimes PGMs are covered in the UCSC course [[https://courses.soe.ucsc.edu/courses/cse290c|CSE 290C]] (when [[https://courses.soe.ucsc.edu/courses/cmps290c/Fall15/01|Lise Getoor teaches it]])
  * **Course at CMU**: Probabilistic Graphical Models [[https://www.cs.cmu.edu/~epxing/Class/10708-20/index.html|Spring 2020]], [[https://www.cs.cmu.edu/~epxing/Class/10708-20/lectures.html|lectures with videos]], [[https://www.cs.cmu.edu/~epxing/Class/10708/|2014 (with videos and scribe notes)]]
  * **Matt Gormley's course at CMU**: [[https://www.cs.cmu.edu/~mgormley/courses/10418/|10418]] (with videos)
  * **Best overview tutorial:** [[https://kuleshov.github.io/cs228-notes/|CS228 Lecture Notes]]
  * [[https://users.soe.ucsc.edu/~niejiazhong/slides/murphy.pdf|Tutorial on Probabilistic Graphical Models]]
  * [[https://linqs.soe.ucsc.edu/sites/default/files/papers/koller-book07.pdf|Book Chapter: Graphical Models in a Nutshell]]
  * Paper: [[paper:A Tutorial on Deep Latent Variable Models of Natural Language]]

===== Models =====
  * Bayesian Networks
  * Markov Random Fields
    * [[https://www.youtube.com/watch?v=iBQkZdPHlCs|Bert Huang's Video]]
  * Factor Graphs

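As a concrete illustration of the factor graph representation listed above, here is a small sketch (the factor values are hypothetical, not from any paper below) of an unnormalized distribution over two binary variables defined by a product of unary and pairwise factors:

```python
# Sketch of a tiny factor graph over two binary variables x1, x2,
# with one unary factor per variable and one pairwise factor.
# The unnormalized probability of an assignment is the product of
# factor values; the partition function Z sums it over all assignments.
import itertools

f1 = {0: 1.0, 1: 2.0}             # hypothetical unary factor on x1
f2 = {0: 1.5, 1: 0.5}             # hypothetical unary factor on x2
f12 = {(0, 0): 2.0, (0, 1): 1.0,  # pairwise factor favoring agreement
       (1, 0): 1.0, (1, 1): 2.0}

def score(x1, x2):
    """Unnormalized probability: product of all factor values."""
    return f1[x1] * f2[x2] * f12[(x1, x2)]

# Brute-force partition function (feasible here; intractable in general).
Z = sum(score(x1, x2) for x1, x2 in itertools.product((0, 1), repeat=2))

def prob(x1, x2):
    return score(x1, x2) / Z

print(prob(1, 1))  # = 2.0 / 8.5
```

Bayesian networks and MRFs can both be converted to this factor-graph form, which is why message-passing algorithms like sum-product are usually stated on factor graphs.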
===== Inference =====
  * Belief Propagation
    * [[http://helper.ipam.ucla.edu/publications/gss2013/gss2013_11344.pdf|Book chapter]]
    * [[http://www.cs.cmu.edu/~mgormley/courses/10418/slides/lecture9-bp.pdf|Matt Gormley's slides]]
    * [[https://www.youtube.com/watch?v=meBWAboEWQk|Bert Huang's Video]]  **Discusses the relation between BP and Lagrangian relaxation at the end.**
  * Markov Chain Monte Carlo (MCMC)
  * Variational Inference
    * [[https://www.youtube.com/watch?v=smfWKhDcaoA|Topic Models: Variational Inference for Latent Dirichlet Allocation (video)]]
    * [[https://arxiv.org/pdf/1601.00670.pdf|Blei et al 2016 - Variational Inference: A Review for Statisticians]]
    * Great description in [[https://arxiv.org/pdf/1603.00788.pdf#page=4|Kucukelbir et al 2016]] (see section 2.2)
    * Great video: [[https://www.youtube.com/watch?v=Dv86zdWjJKQ|Blei - Variational Inference: Foundations and Innovations]] (nice overview at ~10:00)

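To give a feel for one of these inference methods, here is a toy sketch of Gibbs sampling (a simple MCMC algorithm): repeatedly resample each variable from its conditional distribution given the others, then use long-run sample frequencies as estimates. The pairwise potential below is made up for illustration.

```python
# Sketch: Gibbs sampling on a toy pairwise MRF over two binary variables.
# The (hypothetical) potential puts more mass on assignments that agree.
import random

phi = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}

def conditional(other_value):
    """P(x = 1 | the other variable), from the pairwise potential."""
    w0 = phi[(0, other_value)]
    w1 = phi[(1, other_value)]
    return w1 / (w0 + w1)

random.seed(0)
x1, x2 = 0, 0
agree = 0
n_steps = 20000
for _ in range(n_steps):
    # One Gibbs sweep: resample each variable given the current other.
    x1 = 1 if random.random() < conditional(x2) else 0
    x2 = 1 if random.random() < conditional(x1) else 0
    agree += (x1 == x2)

# Exact P(x1 == x2) = (2 + 2) / (2 + 1 + 1 + 2) = 2/3, so the
# estimate should be close to 0.667.
print(agree / n_steps)
```

Exact inference is easy in this two-variable example; the point of MCMC is that the same per-variable resampling step still works when the joint distribution has far too many assignments to enumerate.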
===== Old Papers =====
  * [[http://web.cs.iastate.edu/~honavar/factorgraphs.pdf|Kschischang et al 1998 - Factor Graphs and the Sum-Product Algorithm]]  The paper that introduced factor graphs.
  * [[https://www.aclweb.org/anthology/P05-1045.pdf|Finkel et al 2005 - Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling]]

===== Recent Papers =====
  * [[https://arxiv.org/pdf/1505.04406.pdf|Bach et al 2017 - Hinge-Loss Markov Random Fields and Probabilistic Soft Logic]]
  * **[[https://arxiv.org/pdf/2010.12048.pdf|Chiang & Riley 2020 - Factor Graph Grammars]]**  Introduces a new kind of graphical model (factor graph grammars) that is more expressive than plate notation or dynamic graphical models.  It is expressive enough to represent CFG parsing as a graphical model.  Very cool.

===== Interesting NLP Deep Learning + PGM Papers =====
See also recent advances in [[nlp:HMM|HMMs]] and [[Conditional Random Field|CRFs]].
  * [[https://arxiv.org/pdf/1702.00887.pdf|Kim et al 2017 - Structured Attention Networks]]
  * [[https://www.aclweb.org/anthology/K18-1001.pdf|Thai et al 2018 - Embedded-State Latent Conditional Random Fields for Sequence Labeling]]
  * [[https://arxiv.org/pdf/1906.07880.pdf|Wang et al 2019 - Second-Order Semantic Dependency Parsing with End-to-End Neural Networks]]  Uses loopy BP and variational inference.
  * [[https://arxiv.org/pdf/1902.04094.pdf|Wang & Cho 2019 - BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model]]  WARNING: There is a mistake in this paper; [[https://sites.google.com/site/deepernn/home/blog/amistakeinwangchoberthasamouthanditmustspeakbertasamarkovrandomfieldlanguagemodel|it's not an MRF]].
  * [[https://www.aclweb.org/anthology/2020.emnlp-main.406.pdf|Gao & Gormley 2020 - Training for Gibbs Sampling on Conditional Random Fields with Neural Scoring Factors]] (basically adapts [[https://www.aclweb.org/anthology/P05-1045.pdf|Finkel et al 2005]] to the neural era)
  * [[https://arxiv.org/pdf/2002.07233.pdf|Lee et al 2020 - On the Discrepancy between Density Estimation and Sequence Generation]]  Uses latent variables for fast non-autoregressive generation.
  * [[http://proceedings.mlr.press/v119/srivastava20a/srivastava20a.pdf|Srivastava et al 2020 - Robustness to Spurious Correlations via Human Annotations]]

===== Recent NLP Papers that Use PGMs =====
  * [[https://aclanthology.org/2021.acl-long.346.pdf|Rodriguez et al 2021 - Evaluation Examples Are Not Equally Informative: How Should That Change NLP Leaderboards?]]
  * **Belief Propagation**
    * [[https://aclanthology.org/P11-1048.pdf|Auli & Lopez 2011 - A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing]]
    * [[https://aclanthology.org/W13-4069.pdf|Lee 2013 - Structured Discriminative Model For Dialog State Tracking]]
    * [[https://aclanthology.org/2020.findings-emnlp.142.pdf|Chen et al 2020 - Neural Dialogue State Tracking with Temporally Expressive Networks]]
    * [[https://aclanthology.org/P19-1382.pdf|Bjerva et al 2019 - Uncovering Probabilistic Implications in Typological Knowledge Bases]]
  * **MCMC**
    * [[https://aclanthology.org/D12-1101.pdf|Singh et al 2012 - Monte Carlo MCMC: Efficient Inference by Approximate Sampling]]
    * [[https://www.aclweb.org/anthology/2020.emnlp-main.406.pdf|Gao & Gormley 2020 - Training for Gibbs Sampling on Conditional Random Fields with Neural Scoring Factors]] (basically adapts [[https://www.aclweb.org/anthology/P05-1045.pdf|Finkel et al 2005]] to the neural era)

===== Related Pages =====
  * [[Bayesian Methods]]  Bayesian methods often use inference techniques from graphical models, such as MCMC and variational inference, and often represent the likelihood and prior as a graphical model.
  * [[Conditional Random Field]]
  * [[Probabilistic Logic]]
  
ml/probabilistic_graphical_models.1635959521.txt.gz · Last modified: 2023/06/15 07:36 (external edit)
