nlp:language_identification

Differences

This shows you the differences between two versions of the page.

nlp:language_identification [2021/03/31 22:17] – [Software] jmflanig
nlp:language_identification [2023/06/15 07:36] (current) – external edit 127.0.0.1
Line 1: Line 1:
 ====== Language Identification ======
 +
 +===== Overviews =====
 +  * [[https://arxiv.org/pdf/1804.08186.pdf|Jauhiainen et al 2018 - Automatic Language Identification in Texts: A Survey]]
  
 ===== Methods and Papers =====
-  * [[https://arxiv.org/pdf/1909.12940.pdf|Palakodety et al 2020- Hope Speech Detection: A Computational Analysis of the Voice of Peace]] Clustering based on polyglot word embeddings is an easy method for language detection (see section 5.1).
 +  * [[https://www.aclweb.org/anthology/P12-3005.pdf|Lui & Baldwin 2012 - langid.py: An Off-the-shelf Language Identification Tool]]
 +  * [[https://arxiv.org/pdf/1909.12940.pdf|Palakodety et al 2020- Hope Speech Detection: A Computational Analysis of the Voice of Peace]] Clustering based on polyglot word embeddings is an easy method for unsupervised language detection (see section 5.1). 
 +  * [[https://www.aclweb.org/anthology/2020.wnut-1.24.pdf|Palakodety & KhudaBukhsh 2020 - Annotation Efficient Language Identification from Weak Labels]]
  
 ===== Software =====
Line 8: Line 13:
   * [[https://fasttext.cc/blog/2017/10/02/blog-post.html|FastText Language ID]]
   * [[https://cloud.google.com/translate/docs/basic/detecting-language|GoogleLangID]]
-  * [[https://github.com/saffsd/langid.py|langid.py]]
 +  * [[https://github.com/saffsd/langid.py|langid.py]]  [[https://www.aclweb.org/anthology/P12-3005.pdf|paper]]
   * [[https://pypi.org/project/langdetect/|langdetect]]
   * [[https://spacy.io/universe/project/spacy-langdetect|spaCy langdetect]]
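
Several of the tools listed above are built on character-level statistics; the langid.py paper, for instance, describes a naive Bayes classifier over byte n-grams. The following is a minimal sketch of that general idea in plain Python, not the implementation of any listed tool. The class name ''NgramLangID'', the helper ''char_ngrams'', and the tiny ''TRAIN'' corpus are all hypothetical and exist only for illustration; a real system would train on far more text and apply feature selection.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_max=3):
    """Yield all character n-grams of length 1..n_max from the lowercased text."""
    text = f" {text.lower()} "  # pad so word boundaries become features
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            yield text[i:i + n]


class NgramLangID:
    """Toy multinomial naive Bayes over character n-grams (illustrative only)."""

    def __init__(self, n_max=3):
        self.n_max = n_max
        self.counts = defaultdict(Counter)  # language -> n-gram counts
        self.totals = Counter()             # language -> total n-grams seen
        self.vocab = set()                  # all n-grams seen in training

    def train(self, samples):
        """Accumulate n-gram counts from (language_code, text) pairs."""
        for lang, text in samples:
            grams = list(char_ngrams(text, self.n_max))
            self.counts[lang].update(grams)
            self.totals[lang] += len(grams)
            self.vocab.update(grams)

    def classify(self, text):
        """Return the language with the highest smoothed log-likelihood.

        A uniform prior over languages is assumed.
        """
        vocab_size = len(self.vocab)
        best_lang, best_score = None, -math.inf
        for lang, lang_counts in self.counts.items():
            denom = self.totals[lang] + vocab_size
            score = 0.0
            for gram in char_ngrams(text, self.n_max):
                # Add-one smoothing keeps unseen n-grams from zeroing a language.
                score += math.log((lang_counts[gram] + 1) / denom)
            if score > best_score:
                best_lang, best_score = lang, score
        return best_lang


# Hypothetical toy training data: one sentence per language.
TRAIN = [
    ("en", "the quick brown fox jumps over the lazy dog and this is a "
           "simple english sentence about the weather"),
    ("de", "der schnelle braune fuchs springt über den faulen hund und das "
           "ist ein einfacher deutscher satz über das wetter"),
    ("fr", "le renard brun rapide saute par-dessus le chien paresseux et "
           "ceci est une phrase simple en français sur le temps"),
]

model = NgramLangID()
model.train(TRAIN)
print(model.classify("das ist ein hund"))
```

With so little training text the sketch is only reliable on inputs that resemble its training sentences; for real use, one of the packages listed above is the better starting point.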
  
 ===== Related Pages =====
 +  * [[Code Switching]] 
 +  * [[Data Preparation]]
  
nlp/language_identification.1617229061.txt.gz · Last modified: 2023/06/15 07:36 (external edit)
