Table of Contents
Robustness and Brittleness in NLP (and Deep Learning)
Papers
Conferences, Workshops, and Shared Tasks
People
Related Pages
Robustness and Brittleness in NLP (and Deep Learning)
For an overview, read Jia and Liang 2017 and Jin et al 2019.
Papers
Jia and Liang 2017 - Adversarial Examples for Evaluating Reading Comprehension Systems
Gururangan et al 2018 - Annotation Artifacts in Natural Language Inference Data
Poliak et al 2018 - Hypothesis Only Baselines in Natural Language Inference
Zellers et al 2018 - SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference
Introduces Adversarial Filtering to mitigate dataset bias
McCoy et al 2019 - Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
Gan and Ng 2019 - Improving the Robustness of Question Answering Systems to Question Paraphrasing
Jin et al 2019 - Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment
Srivastava et al 2020 - Robustness to Spurious Correlations via Human Annotations
Tu et al 2020 - An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models
Bartolo et al 2020 - Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension
Applies human-in-the-loop adversarial annotation to QA
Si et al 2020 - Benchmarking Robustness of Machine Reading Comprehension Models
Ribeiro et al 2020 - Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
Rosenman et al 2020 - Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data
Shows that deep learning relation extraction models often rely on shallow heuristics
Lin et al 2021 - Using Adversarial Attacks to Reveal the Statistical Bias in Machine Reading Comprehension Models
Conferences, Workshops, and Shared Tasks
Ettinger et al 2017 - Towards Linguistically Generalizable NLP Systems: A Workshop and Shared Task
Build It, Break It: The Language Edition
EMNLP 2017 Workshop - Building Linguistically Generalizable NLP Systems
People
Hal Daumé III
Yoav Goldberg
Percy Liang
Ellie Pavlick
Related Pages
Dataset Bias (Annotation Artifacts)
Distribution Shift
Robust Evaluation