nlp:machine_translation — revisions 2022/10/07 19:31 ([Multilingual Translation], jmflanig) → 2024/08/13 06:21 (current) ([Evaluation], jmflanig)
  
===== Low-Resource =====
  * [[https://arxiv.org/pdf/1604.02201.pdf|Zoph et al 2016 - Transfer Learning for Low-Resource Neural Machine Translation]]
  * [[https://www.aclweb.org/anthology/N18-1032.pdf|Gu et al 2018 - Universal Neural Machine Translation for Extremely Low Resource Languages]]
  * Comparison of SMT vs NMT for low-resource MT
    * [[https://arxiv.org/pdf/1905.11901.pdf|Sennrich & Zhang 2019 - Revisiting Low-Resource Neural Machine Translation: A Case Study]]
    * [[https://www.aclweb.org/anthology/2020.lrec-1.325.pdf|Duh et al 2020 - Benchmarking Neural and Statistical Machine Translation on Low-Resource African Languages]]
  * [[https://research.facebook.com/file/585831413174038/No-Language-Left-Behind--Scaling-Human-Centered-Machine-Translation.pdf|Costa-jussà et al 2022 - No Language Left Behind: Scaling Human-Centered Machine Translation]] [[https://github.com/facebookresearch/flores|dataset]] [[https://ai.facebook.com/blog/nllb-200-high-quality-machine-translation/|blog]] [[https://ai.facebook.com/research/no-language-left-behind/|website]] [[https://github.com/facebookresearch/fairseq/tree/nllb/|model]] [[https://nllb.metademolab.com/|demo]] Transformer encoder-decoder model with sparsely gated mixture of experts. 50B params, and also distilled versions.
  * [[https://aclanthology.org/2022.wmt-1.73.pdf|Marco & Fraser 2022 - Findings of the WMT 2022 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT]]
  
===== Character-Level =====
===== Pretraining =====
  * [[https://arxiv.org/pdf/1804.06323.pdf|Qi et al 2018 - When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?]]
  * [[https://arxiv.org/pdf/2002.06823.pdf|Zhu et al 2020 - Incorporating BERT into Neural Machine Translation]]
  * [[https://arxiv.org/pdf/2008.00401.pdf|Tang et al 2020 - Multilingual Translation with Extensible Multilingual Pretraining and Finetuning]]
  
===== Unsupervised =====
  * [[https://arxiv.org/pdf/1711.00043.pdf|Lample et al 2017 - Unsupervised Machine Translation Using Monolingual Corpora Only]]
  * [[https://arxiv.org/pdf/1906.06718.pdf|Luo et al 2019 - Neural Decipherment via Minimum-Cost Flow: from Ugaritic to Linear B]]
  * [[https://arxiv.org/pdf/1905.02450.pdf|Song et al 2019 - MASS: Masked Sequence to Sequence Pre-training for Language Generation]] [[https://github.com/microsoft/MASS|github]]
  * [[https://arxiv.org/pdf/2004.05516.pdf|Marchisio et al 2020 - When Does Unsupervised Machine Translation Work?]]
  * [[https://arxiv.org/pdf/2106.15818.pdf|Marchisio et al 2021 - On Systematic Style Differences between Unsupervised and Supervised MT and an Application for High-Resource Machine Translation]]
  * [[https://aclanthology.org/2022.wmt-1.73.pdf|Marco & Fraser 2022 - Findings of the WMT 2022 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT]]
  * [[https://arxiv.org/pdf/2310.10385.pdf|Tan & Monz 2023 - Towards a Better Understanding of Variations in Zero-Shot Neural Machine Translation Performance]]
  
===== Sentence Alignment =====
  * [[https://arxiv.org/pdf/2004.06063.pdf|Freitag et al 2020 - BLEU might be Guilty but References are not Innocent]]
  * [[https://arxiv.org/pdf/2106.15195.pdf|Marie et al 2021 - Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers]]
  * [[https://arxiv.org/pdf/2310.10482.pdf|Guerreiro et al 2023 - xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection]]
  * [[https://arxiv.org/pdf/2302.14520|Kocmi & Federmann 2023 - Large Language Models Are State-of-the-Art Evaluators of Translation Quality]]
  * **Evaluation of Metrics**
    * **[[https://aclanthology.org/2021.tacl-1.87.pdf|Freitag et al 2021 - Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation]]** Used in [[https://www2.statmt.org/wmt24/metrics-task.html|WMT]]
  
=== BLEU ===
  
  * [[https://github.com/mjpost/sacrebleu|SacreBLEU]] (recommended) [[https://arxiv.org/pdf/1804.08771.pdf|paper]]
    * Internally uses [[https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/mteval-v13a.pl|mteval-v13a.pl]] as the tokenizer
    * To simulate SacreBLEU evaluation but with statistical significance testing, tokenize your output and references with the [[https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/mteval-v13a.pl|mteval-v13a.pl]] script, then run [[https://github.com/jhclark/multeval|MultEval]]
  * [[https://github.com/neulab/compare-mt|Compare-MT]] Analyzes the differences between two systems and computes statistical significance. [[https://www.aclweb.org/anthology/N19-4007.pdf|paper]]
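 The tools above all compute variants of the same quantity: clipped n-gram precision combined with a brevity penalty. A minimal pure-Python sketch of corpus-level BLEU (single reference, no smoothing; for reported scores use SacreBLEU as recommended above, since tokenization choices change the number):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Multiset of n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hyps, refs, max_n=4):
    """Corpus-level BLEU over whitespace-tokenized strings (0-100 scale)."""
    match = [0] * max_n   # clipped n-gram matches, per order
    total = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hyps, refs):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            hc, rc = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            match[n - 1] += sum(min(c, rc[g]) for g, c in hc.items())
            total[n - 1] += max(len(h) - n + 1, 0)
    if min(match) == 0:
        return 0.0  # any zero n-gram precision makes the geometric mean zero
    log_prec = sum(math.log(m / t) for m, t in zip(match, total)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * bp * math.exp(log_prec)
```

This sketch illustrates why BLEU is tokenization-sensitive (it splits on whitespace only) and why short outputs score 0 without smoothing — exactly the pitfalls SacreBLEU standardizes away.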
  
===== Datasets =====

==== Papers About Corpus Collection ====
  * [[https://aclanthology.org/J03-3002.pdf|Resnik & Smith 2003 - The Web as a Parallel Corpus]] - The foundational paper about collecting parallel data from the web.
  
==== Standard Datasets ====
===== People =====
  * [[https://scholar.google.com/citations?user=phgBJXYAAAAJ&hl=en|Wilker Aziz]]
  * [[https://scholar.google.com/citations?user=iPAX6jcAAAAJ&hl=en|Marine Carpuat]]
  * [[https://scholar.google.com/citations?user=dok0514AAAAJ&hl=en|David Chiang]]
  * [[https://scholar.google.com/citations?user=dLaR9lgAAAAJ&hl=en|Orhan Firat]]
nlp/machine_translation.1665171065.txt.gz · Last modified: 2023/06/15 07:36 (external edit)
