Normalization

Normalization can improve the optimizer's ability to train a neural network. Normalization procedures fall into two main categories: activation normalization, which rescales the activations flowing through the network, and weight normalization, which rescales the weights themselves (Shen et al., 2020).

Overviews

Activation Normalization Schemes

Batch Normalization

Batch normalization is popular in computer vision, but it is rarely used in NLP, where it tends to perform poorly: its statistics are computed across the batch, and in NLP these batch statistics fluctuate heavily over the course of training (Shen et al., 2020). Layer normalization, which normalizes each example independently of the rest of the batch, is usually used instead.
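The difference comes down to which axis the statistics are taken over. A minimal NumPy sketch (variable names are illustrative; the learnable scale and shift parameters are omitted):

```python
import numpy as np

# Toy activations: 8 examples, 4 features each.
x = np.random.randn(8, 4)
eps = 1e-5  # small constant for numerical stability

# Batch normalization: normalize each feature over the batch axis (axis=0),
# so the statistics depend on which examples happen to share the batch.
bn = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

# Layer normalization: normalize each example over its own features (axis=1),
# independent of the rest of the batch.
ln = (x - x.mean(axis=1, keepdims=True)) / np.sqrt(x.var(axis=1, keepdims=True) + eps)
```

Because layer normalization's statistics never cross the batch axis, they are unaffected by batch composition, which is one reason it behaves more stably on sequence data.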

Layer Normalization

Weight Normalization Schemes

Weight Normalization
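As a sketch of the idea (following Salimans and Kingma, 2016): weight normalization reparameterizes a weight vector w as a direction v and a scalar gain g, w = g * v / ||v||, decoupling the weight's length from its direction. Variable names here are illustrative:

```python
import numpy as np

# Direction parameter v and gain parameter g; both are learned in practice.
v = np.array([3.0, 4.0])
g = 2.0

# Reparameterized weight: its norm is exactly g, regardless of v's scale.
w = g * v / np.linalg.norm(v)
```

Gradient descent then updates v and g separately rather than w directly.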

Other or Uncategorized Schemes