====== Low-Resource NLP ======

===== Overviews =====
  * [[https://arxiv.org/pdf/2010.12309.pdf|Hedderich et al 2020 - A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios]]

===== Papers =====
  * [[https://www.aclweb.org/anthology/N13-1014.pdf|Garrette & Baldridge 2013 - Learning a Part-of-Speech Tagger from Two Hours of Annotation]]
  * [[https://arxiv.org/pdf/2202.12499.pdf|Wang et al 2022 - PromDA: Prompt-based Data Augmentation for Low-Resource NLU Tasks]]
  * [[https://aclanthology.org/2022.acl-long.108.pdf|Zhang et al 2022 - How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language]]
  * AfriBERTa: [[https://aclanthology.org/2021.mrl-1.11.pdf|Ogueji et al 2021 - Small Data? No Problem! Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages]]

===== Datasets =====
  * Cherokee
    * See [[https://www.cs.unc.edu/~shiyue/|Shiyue Zhang's work]]
    * [[https://github.com/ZhangShiyue/ChrEn|ChrEn dataset]] for MT
  * African Languages
    * NER: [[https://github.com/masakhane-io/masakhane-ner/|MasakhaNER]] [[https://arxiv.org/pdf/2103.11811.pdf|Adelani et al 2021 - MasakhaNER: Named entity recognition for African languages]]
  * Many
    * Google's **Under-Represented Languages for NLP**: [[https://github.com/google-research/url-nlp|github]]

===== Conferences and Workshops =====
  * AfricaNLP Workshop [[https://sites.google.com/view/africanlp-workshop|2021]]

===== Courses and Tutorials =====
  * [[https://github.com/neubig/lowresource-nlp-bootcamp-2020|CMU LTI Low Resource NLP Bootcamp 2020]]

===== People =====
  * [[https://scholar.google.com/citations?user=_MQ_lNgAAAAJ&hl=en|Shruti Rijhwani]]

===== Related Pages =====
  * [[machine_translation#low-resource|Machine Translation - Low-Resource]]