===== Evaluation =====
  * [[https://arxiv.org/pdf/2004.04696.pdf|Sellam et al 2020 - BLEURT: Learning Robust Metrics for Text Generation]] (ACL 2020)
  * [[https://arxiv.org/pdf/2102.01672.pdf|Gehrmann et al 2021 - The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics]] ([[https://gem-benchmark.com/|Website]])
  * [[https://aclanthology.org/2021.tacl-1.87.pdf|Freitag et al 2021 - Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation]] Uses the Multidimensional Quality Metrics (MQM) framework. MT paper; used in the [[https://www2.statmt.org/wmt24/metrics-task.html|WMT metrics task]]
  * [[https://arxiv.org/pdf/2107.01294.pdf|Dou et al 2021 - Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text]]
  * [[https://arxiv.org/pdf/2406.07935|Ruan et al 2024 - Defining and Detecting Vulnerability in Human Evaluation Guidelines: A Preliminary Study Towards Reliable NLG Evaluation]]
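
The MQM-style human evaluation referenced in the Freitag et al entry can be sketched in a few lines: annotators mark error spans with a category and a severity, and a segment's score is the sum of per-error penalties. A minimal illustrative sketch follows; the specific weights (minor=1, major=5, reduced 0.1 for minor fluency/punctuation) and category names are assumptions modeled on the weighting described in Freitag et al 2021, not a definitive implementation.

```python
from dataclasses import dataclass

@dataclass
class Error:
    """One annotated error span (category and severity are free-form here)."""
    category: str   # e.g. "accuracy/mistranslation" (illustrative labels)
    severity: str   # "minor" or "major"

def mqm_penalty(err: Error) -> float:
    # Assumed weights, modeled on Freitag et al 2021:
    # major = 5, minor = 1, minor fluency/punctuation = 0.1.
    if err.severity == "major":
        return 5.0
    if err.category == "fluency/punctuation":
        return 0.1
    return 1.0

def mqm_score(errors_per_segment: list[list[Error]]) -> float:
    """Average total penalty per segment; lower is better."""
    totals = [sum(mqm_penalty(e) for e in seg) for seg in errors_per_segment]
    return sum(totals) / len(totals) if totals else 0.0

segments = [
    [Error("accuracy/mistranslation", "major")],                         # 5.0
    [Error("fluency/punctuation", "minor"), Error("style/awkward", "minor")],  # 1.1
    [],                                                                  # 0.0
]
print(mqm_score(segments))  # (5.0 + 1.1 + 0.0) / 3
```

The point of the framework is that error-span annotation with weighted severities is more reliable than direct 0-100 quality ratings; the weights above are only one possible configuration.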
  
===== Historical Papers =====
Older papers of historical interest.

  * [[https://aclanthology.org/C10-1012.pdf|Bohnet et al 2010 - Broad Coverage Multilingual Deep Sentence Generation with a Stochastic Multi-Level Realizer]]
  
===== Datasets =====
nlp/generation.1723530194.txt.gz · Last modified: 2024/08/13 06:23 by jmflanig
