Domain Adaptation
See Wikipedia - Domain Adaptation. In NLP, domain adaptation typically becomes necessary when the training data comes from a different genre than the test data: for example, you train on newswire and test on medical text. Since natural language is so varied, domains often differ substantially in lexical items, syntactic patterns, and semantics. For a definition of what constitutes a domain, see van der Wees 2015 - What's in a Domain? Analyzing Genre and Topic Differences in Statistical Machine Translation, or the references in Gururangan 2020.
Overviews
- 2018 - A Survey of Unsupervised Deep Domain Adaptation. Unsupervised domain adaptation differs from standard domain adaptation in that no labeled examples are available in the domain of interest, only unlabeled ones.
Papers (General Domain Adaptation in NLP)
See also Awesome Neural Adaptation in NLP, a curated list of unsupervised domain adaptation papers in NLP (not including MT).
- Daumé III 2009 - Frustratingly Easy Domain Adaptation. A seminal paper; the baseline you should always try (for linear models).
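The core of Daumé's feature augmentation can be sketched in a few lines: each feature is copied into a shared version and a domain-specific version, and a standard linear classifier is then trained on the augmented features. This is a minimal illustrative sketch; the function and feature names are not from the paper.

```python
def augment(features, domain):
    """EasyAdapt-style feature augmentation (sketch of Daumé III 2009).

    Each input feature is duplicated: one copy is shared across all
    domains, and one copy is specific to the example's domain.
    `features` is a dict of feature name -> value; `domain` is a label
    such as "source" or "target".
    """
    out = {}
    for name, value in features.items():
        out[f"shared::{name}"] = value   # active for every domain
        out[f"{domain}::{name}"] = value # active only for this domain
    return out
```

The shared copies let the learner pick up weights that transfer across domains, while the domain-specific copies absorb behavior unique to one domain; the trade-off is handled automatically by the classifier's regularization.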
- The obvious baseline for neural networks is to fine-tune a pre-trained network on the new domain. I don't know of any papers examining this method and the trade-offs associated with it, but there probably is one.
- Kim et al 2016 - Frustratingly Easy Neural Domain Adaptation. Not a very good paper: it does not compare against sensible baselines such as fine-tuning on the new domain.
- Gururangan et al 2020 - Don't Stop Pretraining: Adapt Language Models to Domains and Tasks. Note: measures similarity between domains by counting n-gram overlap. Has nice references on what constitutes a domain.
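The n-gram overlap idea from the last item can be approximated as Jaccard similarity between the n-gram vocabularies of two tokenized corpora. This is a simplified sketch, assuming whitespace-tokenized documents; Gururangan et al. additionally restrict to the most frequent unigrams, which this version does not.

```python
def ngrams(tokens, n=1):
    """Set of n-grams (as tuples) in one tokenized document."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def vocab_overlap(corpus_a, corpus_b, n=1):
    """Jaccard overlap of the n-gram vocabularies of two corpora.

    Each corpus is a list of tokenized documents (lists of strings).
    Returns |A ∩ B| / |A ∪ B|, a rough proxy for domain similarity.
    """
    vocab_a = set().union(*(ngrams(doc, n) for doc in corpus_a))
    vocab_b = set().union(*(ngrams(doc, n) for doc in corpus_b))
    return len(vocab_a & vocab_b) / len(vocab_a | vocab_b)
```

A high overlap between a pretraining corpus and a task corpus suggests that further in-domain pretraining will yield smaller gains than for a low-overlap pair.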
Domain Adaptation in NLP Tasks
Related Pages
nlp/domain_adaptation.1643850323.txt.gz · Last modified: 2023/06/15 07:36 (external edit)