Domain Adaptation
See Wikipedia - Domain Adaptation. In NLP, domain adaptation typically becomes necessary when the training data comes from a different genre than the test data - training on newswire and testing on the medical domain, for example. Since natural language is so varied, domains often differ substantially, with different lexical items, syntactic patterns, and semantics. For a definition of what a domain is, see van der Wees 2015 - What's in a Domain? Analyzing Genre and Topic Differences in Statistical Machine Translation, or the references in Gururangan 2020.
Overviews
- 2018 - A Survey of Unsupervised Deep Domain Adaptation Unsupervised domain adaptation differs from regular domain adaptation in that you get no labeled examples in the domain of interest, only unlabeled ones.
Domain Adaptation (Outside of NLP)
Papers (General Domain Adaptation in NLP)
See also Awesome Neural Adaptation in NLP or (older) A curated list of unsupervised domain adaptation papers in NLP (not including MT).
- Daumé III 2009 - Frustratingly Easy Domain Adaptation A seminal paper, the baseline that you should always try (for linear models).
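The paper's trick, feature augmentation ("EasyAdapt"), is simple enough to sketch in a few lines. The sketch below is an illustrative reconstruction, not the author's code: each feature vector gets a shared "general" copy plus one per-domain copy that is zeroed out for all other domains.

```python
def augment(x, domain, n_domains, d):
    # EasyAdapt feature augmentation: one shared "general" block,
    # plus one block per domain (zeros for all domains but the example's own).
    out = list(x)  # general copy, shared across domains
    for k in range(n_domains):
        out += list(x) if k == domain else [0.0] * d
    return out

# Source-domain example (domain 0) vs. target-domain example (domain 1):
x = [1.0, 2.0]
src = augment(x, domain=0, n_domains=2, d=2)  # -> [1, 2, 1, 2, 0, 0]
tgt = augment(x, domain=1, n_domains=2, d=2)  # -> [1, 2, 0, 0, 1, 2]
```

Any linear classifier trained on the augmented vectors can then learn shared weights in the general block and domain-specific corrections in the per-domain blocks.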
- The obvious baseline for neural networks is to fine-tune a pre-trained network on the new domain. I don't know of any papers examining this method and the trade-offs associated with it, but there should be some.
- Kim et al 2016 - Frustratingly Easy Neural Domain Adaptation Not a very good paper: it doesn't compare against sensible baselines such as fine-tuning on the new domain.
- Ruder & Plank 2018 - Strong Baselines for Neural Semi-Supervised Learning under Domain Shift Proposes multi-task tri-training, which shares parameters across the three tri-training models and trains them jointly on one task.
- Gururangan et al 2020 - Don't Stop Pretraining: Adapt Language Models to Domains and Tasks Note: measures similarity between domains by counting n-gram overlap. Has nice references for what is a domain.
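A crude version of that n-gram-overlap similarity can be sketched as follows; this is a simplified illustration (top-k unigrams with whitespace tokenization, made-up mini-corpora), not the paper's exact procedure:

```python
from collections import Counter

def top_vocab(texts, k=10000):
    # Most frequent unigrams in a corpus (whitespace tokenization for brevity).
    counts = Counter(tok for t in texts for tok in t.lower().split())
    return {w for w, _ in counts.most_common(k)}

def vocab_overlap(corpus_a, corpus_b, k=10000):
    # Jaccard-style overlap of the two corpora's top-k vocabularies.
    va, vb = top_vocab(corpus_a, k), top_vocab(corpus_b, k)
    return len(va & vb) / len(va | vb)

news = ["the senate passed the bill", "markets rallied on the news"]
med = ["the patient passed the exam", "dosage was increased on review"]
print(vocab_overlap(news, med))  # shares only "the", "passed", "on"
```

A low overlap score suggests the target domain is far from the pretraining corpus, which is when continued (domain-adaptive) pretraining helps most.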
- Kulshreshtha et al 2021 - Back-Training excels Self-Training at Unsupervised Domain Adaptation of Question Generation and Passage Retrieval Back-training: generating noisy inputs given outputs (vs self-training: generating noisy outputs given inputs).

Table from Ramponi & Plank 2020.
Domain Adaptation in NLP Tasks
Related Pages
nlp/domain_adaptation.txt · Last modified: 2023/06/15 07:36 by 127.0.0.1