nlp:open_problems

Differences

This shows you the differences between two versions of the page.

nlp:open_problems [2021/03/01 08:13] jmflanig
nlp:open_problems [2023/06/15 07:36] (current) – external edit 127.0.0.1
Line 1:
 ====== Open Problems ======
 Partial list of open problems in NLP.
 +
 +==== Explainability ====
 +  * Explainability is an open problem in machine learning and NLP.  See [[Explainability]].
  
 ==== Machine Translation ====
   * [[https://arxiv.org/pdf/1706.03872.pdf|Koehn & Knowles 2017 - Six Challenges for Neural Machine Translation]]
 +
 +==== Transformers ====
 +  * Transformers are hard to train ([[https://arxiv.org/pdf/2004.08249.pdf|Liu et al. 2020 - Understanding the Difficulty of Training Transformers]]), and often we cannot even get them to overfit, so we just train as long as we can.  This is not a good situation, and it suggests there is some issue with the normalization, the initializer, or the optimizer.  Feedforward networks, CNNs, and RNNs had similar training difficulties for a long time, and these were addressed with Glorot initialization, batch normalization, and layer normalization.  The open problem is: **what are the optimal initialization and normalization procedures for the Transformer?**
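
For readers unfamiliar with the two techniques named above, here is a minimal pure-Python sketch of Glorot (Xavier) uniform initialization and layer normalization. It is only an illustration of the standard definitions, not a proposed answer to the open problem and not code from any of the cited papers. The function names, the 64x64 layer size, and the random seed are assumptions chosen for the example, and the learned gain and bias of layer normalization are omitted.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, rng):
    """Sample a fan_in x fan_out matrix from U(-limit, limit),
    with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def layer_norm(x, eps=1e-5):
    """Normalize one activation vector to zero mean and roughly unit
    variance (learned gain and bias are left out of this sketch)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


# Hypothetical sizes and seed, chosen only for illustration.
rng = random.Random(0)
fan_in, fan_out = 64, 64
W = glorot_uniform(fan_in, fan_out, rng)

# One random input vector passed through a linear layer, then normalized.
x = [rng.gauss(0.0, 1.0) for _ in range(fan_in)]
h = [sum(x[i] * W[i][j] for i in range(fan_in)) for j in range(fan_out)]
h_norm = layer_norm(h)
```

The initializer scales the weights so that activation variance stays roughly constant across a layer, and the normalization rescales each activation vector after the fact. Whether these particular choices are the right ones for Transformers is exactly what the bullet above leaves open.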
 +
 +===== Problems Jeff Thinks are Open Problems =====
 +These are Jeff's opinions.
  
 ==== Pre-training ====
   * **Are there simpler, faster methods for contextualized representations than pre-training Transformers?** Historically, complex methods have been invented before researchers find simpler ways to do a similar thing.  For example, there were methods for pre-training word embeddings that did not scale well until Tomas Mikolov asked the question "Is there a more efficient way to do this?" and invented the skip-gram model ([[paper:Mikolov 2013 - Efficient Estimation of Word Representations in Vector Space]] and follow-up work).  It is an open question if we are in a similar situation with Transformer models and contextualized pre-training today.
  
-==== Explainability ==== 
-  * Explainability is an open problem in machine learning and NLP.  See [[explainability]]. 
  
nlp/open_problems.1614586405.txt.gz · Last modified: 2023/06/15 07:36 (external edit)
