====== Named Entity Recognition ======

===== General NER Papers =====

  * [[https://arxiv.org/pdf/1603.01360.pdf|Lample et al 2016 - Neural Architectures for Named Entity Recognition]]
  * [[https://www.aclweb.org/anthology/2020.emnlp-main.523.pdf|Yamada et al 2020 - LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention]] SOTA on CoNLL-2003
  * [[https://www.aclweb.org/anthology/2020.conll-1.16.pdf|Reiss et al 2020 - Identifying Incorrect Labels in the CoNLL-2003 Corpus]] [[https://github.com/CODAIT/Identifying-Incorrect-Labels-In-CoNLL-2003|Corrected Dataset]]

===== Low-Resource NER =====

  * [[https://www.aclweb.org/anthology/I17-2016.pdf|Cotterell & Duh 2017 - Low-Resource Named Entity Recognition with Cross-Lingual, Character-Level Neural Conditional Random Fields]]
  * [[https://dl.acm.org/doi/pdf/10.1145/3038912.3052642?casa_token=9wgo8-MEihgAAAAA:Y_rfIt47wu8aD4s9o98ziOrFtii153gj2ScUMQH1VF_iIqpHUAT4fDlfjo7mTl59OHbnOUtga7I|Kejriwal & Szekely 2017 - Information Extraction in Illicit Web Domains]]
  * [[https://www.aclweb.org/anthology/2020.acl-main.722.pdf|Rijhwani et al 2020 - Soft Gazetteers for Low-Resource Named Entity Recognition]] Might be a candidate for use in Athena
  * [[https://arxiv.org/pdf/2101.00388.pdf|Yu et al 2021 - A Robust and Domain-Adaptive Approach for Low-Resource Named Entity Recognition]]
  * [[https://arxiv.org/pdf/2101.10587.pdf|Mohan et al 2021 - Low Resource Recognition and Linking of Biomedical Concepts from a Large Ontology]]
  * Recent papers: [[https://paperswithcode.com/task/low-resource-named-entity-recognition]]

===== Distant Supervision, etc =====

  * [[https://dl.acm.org/doi/pdf/10.1145/3394486.3403149?casa_token=2NE6Vs6FdXEAAAAA:rBpsQLLhEIp0yXLRmoVXDr8JNz0eobLNOjF0DBCndDBNQ3XaH-e4WrmzztVv-xDkmVziYIv1uRQ|Liang et al 2020 - BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision]]
  * [[https://www.aclweb.org/anthology/K19-1060.pdf|Mayhew et al 2019 - Named Entity Recognition with Partially Annotated Training Data]]

===== NER in Dialog =====

  * [[https://arxiv.org/pdf/1805.03784.pdf|Bowden et al 2018 - SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems]]
  * [[https://www.aclweb.org/anthology/D18-1126.pdf|Mueller & Durrett 2018 - Effective Use of Context in Noisy Entity Linking]]
  * [[https://arxiv.org/pdf/2005.14408.pdf|Muralidharan et al 2020 - Noise Robust Named Entity Understanding for Voice Assistants]]

===== Metrics =====

  * [[https://aclanthology.org/W03-0419.pdf|Tjong Kim Sang & De Meulder 2003 - Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition]] Gives the metric used in CoNLL-2003
  * [[https://aclanthology.org/2023.acl-long.458.pdf|Andrade et al 2023 - Comparative evaluation of boundary-relaxed annotation for Entity Linking performance]] Gives an overview of prior work on boundary-relaxed matching metrics

===== Software =====

  * Athena's NER and EL software [[https://github.com/wenzi3241/ner_el|GitHub]] (private)
  * [[https://arxiv.org/pdf/1806.05626.pdf|Yang & Zhang 2018 - NCRF++: An Open-source Neural Sequence Labeling Toolkit]] [[https://github.com/jiesutd/NCRFpp]] Adwait says it's easy to add gazetteer features to this software

===== Related Pages =====

  * [[Information Extraction]]