NLP Wiki paper

paper:a_tutorial_on_deep_latent_variable_models_of_natural_language

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Kim et al. 2019 - A Tutorial on Deep Latent Variable Models of Natural Language

paper:discriminative_training_methods_for_hidden_markov_models_-_theory_and_experiments_with_perceptron_algorithms

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Discriminative Training Methods for Hidden Markov Models - Theory and Experiments with Perceptron Algorithms

paper:dual_learning_for_machine_translation

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Dual Learning For Machine Translation TLDR; The authors cast machine translation as a game between a primal task and a dual task, which lets two agents teach each other through reinforcement learning (policy gradient methods). With this implementation, the agent only requires monolingual English

paper:etc

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

ETC: Encoding Long and Structured Inputs in Transformers TLDR; ETC encodes long inputs using global-local attention and represents structures by combining relative position representations and flexible masking. It also employs CPC pre-training for hierarchical global tokens (structures).

paper:experience_grounds_language

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Experience Grounds Language TLDR; Models trained only on text corpora may hold back language understanding research, because the supervision available from text alone is limited; the authors argue that social context also has to be incorporated. They propose the notion of World Scopes (WS), which consider not only the text corpus but also grounding, embodiment, and social interaction.

paper:language_models_are_few-shot_learners

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Language Models are Few-Shot Learners

paper:language_models_are_unsupervised_multitask_learners

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Language Models are Unsupervised Multitask Learners * Code: * Open-source clone of the dataset: OpenWebText (from )

paper:megatron-lm_training_multi-billion_parameter_language_models_using_model_parallelism

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism * Follow-up blog post: State-of-the-Art Language Modeling Using Megatron on the NVIDIA A100 GPU

paper:mikolov_2013_-_distributed_representations_of_words_and_phrases_and_their_compositionality

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Distributed Representations of Words and Phrases and their Compositionality

paper:mikolov_2013_-_efficient_estimation_of_word_representations_in_vector_space

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Efficient Estimation of Word Representations in Vector Space Follow-up work: Mikolov 2013 - Distributed Representations of Words and Phrases and their Compositionality

paper:schwartz_2018_-_sopa_bridging_cnns_rnns_and_weighted_finite-state_machines

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

SoPa: Bridging CNNs, RNNs, and Weighted Finite-State Machines

paper:the_sample_complexity_of_agnostic_learning_under_deterministic_labels

Anonymous (anonymous@undisclosed.example.com) — 2023-06-15T07:36:14+00:00

Paper: The sample complexity of agnostic learning under deterministic labels