====== Compositional Generalization ======

===== Overviews =====
  * [[https://arxiv.org/pdf/2302.01067.pdf|Lin et al 2023 - A Survey on Compositional Generalization in Applications]] Not an NLP paper, and not very comprehensive. WARNING: Missing a bunch of NLP work.

===== Papers =====
  * SCAN dataset: [[https://arxiv.org/pdf/1711.00350.pdf|Lake & Baroni 2017 - Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks]]
  * [[https://arxiv.org/pdf/1807.04640.pdf|Chang et al 2018 - Automatically Composing Representation Transformations as a Means for Generalization]] Has a good section on what compositional generalization is. Reports positive results for curriculum learning.
  * [[https://arxiv.org/pdf/1906.05381.pdf|Lake 2019 - Compositional generalization through metasequence-to-sequence learning]]
  * M-PCFGSET dataset: [[https://eliabruni.github.io/publications/hupkes2019compositionality.pdf|Hupkes et al 2019 - The Compositionality of Neural Networks: Integrating Symbolism and Connectionism]]
  * COGS dataset: [[https://arxiv.org/pdf/2010.05465.pdf|Kim & Linzen 2020 - COGS: A Compositional Generalization Challenge Based on Semantic Interpretation]]
  * [[https://arxiv.org/pdf/1912.09713.pdf|Keysers et al 2019 - Measuring Compositional Generalization: A Comprehensive Method on Realistic Data]] Introduces the MCD data split method.
  * [[https://arxiv.org/pdf/2008.06662.pdf|Chen et al 2020 - Compositional Generalization via Neural-Symbolic Stack Machines]]
  * [[https://arxiv.org/pdf/2009.06040.pdf|Herzig & Berant 2020 - Span-based Semantic Parsing for Compositional Generalization]]
  * [[https://arxiv.org/pdf/2010.12725.pdf|Shaw et al 2020 - Compositional Generalization and Natural Language Variation: Can a Semantic Parsing Approach Handle Both?]] [[https://github.com/google-research/language/tree/master/language/nqg|Code and data]] Introduces the TMCD split method.
  * Improving Transformers for COG:
    * **[[https://arxiv.org/pdf/2108.12284.pdf|Csordás et al 2021 - The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers]]**
    * [[https://arxiv.org/pdf/2108.04378.pdf|Ontañón et al 2021 - Making Transformers Solve Compositional Tasks]]

==== Papers using Curriculum Learning ====
  * [[https://arxiv.org/pdf/2006.10627.pdf|Liu et al 2020 - Compositional Generalization by Learning Analytical Expressions]]

==== Prompting and Language Models ====
  * [[https://arxiv.org/pdf/2209.15003.pdf|Drozdov et al 2022 - Compositional Semantic Parsing with Large Language Models]]

==== COG in Machine Translation ====
  * [[https://arxiv.org/pdf/2210.06709.pdf|Yin et al 2022 - Categorizing Semantic Representations for Neural Machine Translation]]

==== Non-NLP Papers ====
  * [[https://arxiv.org/pdf/2006.09437.pdf|Klinger et al 2020 - A Study of Compositional Generalization in Neural Models]] Has some negative results for curriculum learning.

===== Datasets =====
  * Semantic Parsing
    * [[https://github.com/brendenlake/SCAN|SCAN dataset]] [[https://arxiv.org/pdf/1711.00350.pdf|paper]]. The paper also introduces a machine translation dataset for compositional generalization.
    * [[https://github.com/najoungkim/COGS|COGS dataset]] [[https://arxiv.org/pdf/2010.05465.pdf|paper]]
    * [[https://yale-lily.github.io/spider|Spider dataset]] [[https://arxiv.org/pdf/1809.08887.pdf|paper]]
    * CFQ dataset [[https://arxiv.org/pdf/1912.09713.pdf|paper]]
    * PCFG dataset [[https://arxiv.org/pdf/1908.08351.pdf|paper]] (a benchmark for composing string edit operations)
    * M-PCFGSET dataset [[https://eliabruni.github.io/publications/hupkes2019compositionality.pdf|paper]]
  * Question Answering
    * ComQA: Compositional Question-Answering dataset [[https://arxiv.org/pdf/2101.06400.pdf|paper]]
  * Machine Translation
    * Small-scale MT dataset from the [[https://arxiv.org/pdf/1711.00350.pdf|SCAN dataset paper]]
    * CoGnition dataset [[https://arxiv.org/pdf/2105.14802.pdf|paper]]

===== People =====
  * [[https://scholar.google.com/citations?user=dnZ8udEAAAAJ&hl=en|Jacob Andreas]] [[https://twitter.com/jacobandreas|Tw]]

===== Related Pages =====
  * [[ml:Curriculum Learning]] (often useful for compositional generalization)
  * [[Experimental Method#Effects of the Random Seed]]