Loss Functions
- Cross-entropy (aka log loss, conditional log-likelihood, CRF loss)
- Lots of different ways to write this loss function. One way is to minimize $L(\mathcal{D}) = -\sum_{i=1}^{N} \log p(y_i|x_i)$, where $p(y|x) = \frac{e^{score(x,y)}}{\sum_{y'} e^{score(x,y')}}$
- The cross-entropy version writes it as $L(\mathcal{D}) = -\sum_{i=1}^{N}\sum_{y} p(y|x_i) \log p_\theta(y|x_i)$, but usually we plug in the empirical distribution $p(y|x_i) = \mathbb{I}[y=y_i]$, which recovers the log-loss above.
- The minimum of cross-entropy loss does not always exist; in particular, it does not exist if the training data can be completely separated. See, for example, Section 1.1 of this paper.
- Perceptron loss
- Hinge (SVM) loss
- Softmax margin
- Large-Margin Softmax Loss for Convolutional Neural Networks (L-Softmax). Doesn't cite Gimpel & Smith. I suspect it may be different, but need to check.
- Ramp loss
- Soft ramp loss
- Infinite ramp loss
- Squared error loss
- Squentropy (Cross-entropy + squared error)
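A minimal NumPy sketch of several of the losses above for a single example, where `scores` is the vector of $score(x,y)$ over all labels and `y` is the gold label index. The function names, the uniform cost of 1 for wrong labels in the margin losses, and the squentropy formulation (cross-entropy plus the mean squared logit of the incorrect classes, per my reading) are my choices, not anything fixed by the notes above:

```python
import numpy as np

def log_softmax(scores):
    # numerically stable log-softmax: log p(y|x) = score(x,y) - log sum_y' exp(score(x,y'))
    s = scores - scores.max()
    return s - np.log(np.exp(s).sum())

def cross_entropy_loss(scores, y):
    # log loss: -log p(y|x) under the softmax distribution over scores
    return -log_softmax(scores)[y]

def perceptron_loss(scores, y):
    # max_{y'} score(x,y') - score(x,y); zero when the gold label already wins
    return scores.max() - scores[y]

def hinge_loss(scores, y, cost=1.0):
    # multiclass SVM loss: cost-augmented max, with cost 0 for the gold label
    costs = np.full_like(scores, cost)
    costs[y] = 0.0
    return (scores + costs).max() - scores[y]

def softmax_margin_loss(scores, y, cost=1.0):
    # softmax margin: hinge loss with the max replaced by log-sum-exp
    costs = np.full_like(scores, cost)
    costs[y] = 0.0
    a = scores + costs
    return (a.max() + np.log(np.exp(a - a.max()).sum())) - scores[y]

def ramp_loss(scores, y, cost=1.0):
    # ramp loss: cost-augmented max minus the plain max (non-convex)
    costs = np.full_like(scores, cost)
    costs[y] = 0.0
    return (scores + costs).max() - scores.max()

def squared_error_loss(scores, y):
    # squared error against a one-hot target
    target = np.zeros_like(scores)
    target[y] = 1.0
    return ((scores - target) ** 2).sum()

def squentropy_loss(scores, y):
    # squentropy, as I read it: cross-entropy plus the average squared
    # logit over the incorrect classes (pushing wrong logits toward 0)
    mask = np.ones_like(scores, dtype=bool)
    mask[y] = False
    return cross_entropy_loss(scores, y) + (scores[mask] ** 2).mean()
```

Note the ordering this makes visible: since log-sum-exp upper-bounds max, softmax margin upper-bounds hinge, which upper-bounds perceptron loss.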
Related Pages
ml/loss_functions.1686814574.txt.gz · Last modified: 2023/06/15 07:36 by 127.0.0.1