
Evaluation

Natural Language Output

To evaluate natural language output, researchers typically combine automatic metrics with human evaluation. BLEU, which measures n-gram overlap between system output and reference texts, is the standard automatic metric for machine translation and generation; for summarization, the recall-oriented ROUGE family is commonly used instead.
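
As a minimal sketch (the example sentences are invented for illustration), the snippet below computes corpus-level BLEU with the sacrebleu package and ROUGE-1/ROUGE-L F1 with the rouge-score package:

import sacrebleu
from rouge_score import rouge_scorer

# One system output and one reference per segment; sentences are made up.
hypotheses = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]  # one inner list per reference stream

# Corpus-level BLEU on a 0-100 scale.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")

# ROUGE-1 and ROUGE-L F1 (0-1 scale) for a single hypothesis/reference pair.
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(references[0][0], hypotheses[0])  # score(target, prediction)
print(f"ROUGE-1 F1: {scores['rouge1'].fmeasure:.3f}")
print(f"ROUGE-L F1: {scores['rougeL'].fmeasure:.3f}")

sacrebleu is generally preferred over ad-hoc BLEU implementations because it standardizes tokenization and reference handling, which makes reported scores comparable across papers.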

See also Generation - Evaluation, Machine Translation - Evaluation, and Dialog - Evaluation.

Papers

Evaluation with Large Language Models

Robust Evaluation
