nlp:evaluation
Differences
This shows you the differences between two versions of the page.
| nlp:evaluation [2021/09/08 03:42] – jmflanig | nlp:evaluation [2025/11/18 22:24] (current) – [Evaluation with Large Language Models] jmflanig | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Evaluation ====== | ====== Evaluation ====== | ||
| + | ===== Natural Language Output ===== | ||
| + | To evaluate natural language output, researchers often use BLEU or human evaluation. For summarization, | ||
| + | |||
| + | See also Generation - [[Generation# | ||
| ===== Papers ===== | ===== Papers ===== | ||
| * [[https:// | * [[https:// | ||
| + | |||
| + | |||
| + | ===== Evaluation with Large Language Models ===== | ||
| + | * **Overviews** | ||
| + | * [[https:// | ||
| + | * Blog: [[https:// | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| + | * **[[https:// | ||
| + | * **[[https:// | ||
| + | * [[https:// | ||
| ===== Robust Evaluation ===== | ===== Robust Evaluation ===== | ||
| * **[[https:// | * **[[https:// | ||
| - | |||
| - | ===== Natural Language Output ===== | ||
| - | To evaluate natural language output, researchers often use BLEU or human evaluation. For summarization, | ||
| See also Generation - [[Generation# | See also Generation - [[Generation# | ||
| + | |||
| + | ===== Related Pages ===== | ||
| + | * [[Experimental Method|Experimental Method and Reproducibility]] | ||
| + | * Natural Language Output | ||
| + | * Generation - [[Generation# | ||
| + | * Machine Translation - [[Machine Translation# | ||
| + | * Dialog - [[Dialog# | ||
| + | * Question Answering - [[Question Answering# | ||
nlp/evaluation.1631072554.txt.gz · Last modified: 2023/06/15 07:36 (external edit)