nlp:data_augmentation

Differences

This shows you the differences between two versions of the page.

nlp:data_augmentation [2025/03/27 19:13] – [LLM / Prompt-Based Data Augmentation or Generation] jmflanig
nlp:data_augmentation [2025/05/21 19:53] (current) – [LLM / Prompt-Based Data Augmentation or Generation] jmflanig
Line 15: Line 15:
 Aka **synthetic data generation**.  For evaluation, see [[Evaluation#Evaluation with Large Language Models]].
   * **Overviews**
 +    * [[https://arxiv.org/pdf/2402.13446|Tan et al 2024 - Large Language Models for Data Annotation and Synthesis: A Survey]]
     * **[[https://aclanthology.org/2024.findings-acl.658.pdf|Long et al 2024 - On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey]]** Really great survey, lots of practical advice
 +    * [[https://arxiv.org/pdf/2411.04637|2024 - Hands-On Tutorial: Labeling with LLM and Human-in-the-Loop]] [[https://docs.google.com/presentation/d/1vum7or5PqLCE6MbbH2KnJrwJ2uzLMEpc19M6Nzuy_0I/edit?slide=id.p#slide=id.p|slides]] from [[https://toloka.ai/events/toloka-ai-coling-2025-human-w-llm-tutorial|here]]
   * [[https://arxiv.org/pdf/2108.13487.pdf|Wang et al 2021 - Want To Reduce Labeling Cost? GPT-3 Can Help]]
   * [[https://arxiv.org/pdf/2202.12499.pdf|Wang et al 2022 - PromDA: Prompt-based Data Augmentation for Low-Resource NLU Tasks]]
Line 25: Line 27:
  
 ===== Related Pages =====
 +  * [[Crowdsourcing]]
   * [[Dataset Creation]]
   * [[ml:Data Augmentation|ML - Data Augmentation]]
   * [[nlp:semantic_parsing#data_augmentation|Semantic Parsing - Data Augmentation]]
  
nlp/data_augmentation.1743102788.txt.gz · Last modified: 2025/03/27 19:13 by jmflanig