====== Optimization in Deep Learning ======

  * Weight normalization
    * Improves the conditioning of the optimization problem ([[https://arxiv.org/pdf/1602.07868.pdf|Salimans & Kingma 2016]])
  * Lipschitz constant
    * [[https://arxiv.org/pdf/2306.09338|Qi et al 2023 - Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant]] Discusses the effect of the Lipschitz constant on the optimization of deep neural networks.
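As a minimal sketch of the weight-normalization reparameterization from Salimans & Kingma (the reparameterization ''w = g v / ||v||'' is from the paper; the function name and NumPy phrasing here are illustrative, not the paper's code):

```python
import numpy as np

def weight_norm(v, g):
    """Weight normalization (Salimans & Kingma 2016): reparameterize a
    weight vector w as w = g * v / ||v||, decoupling the direction of
    the weights (v) from their magnitude (g)."""
    return g * v / np.linalg.norm(v)

# The effective weight always has norm exactly g, however v is scaled,
# which is what improves the conditioning of the problem.
v = np.array([3.0, 4.0])
w = weight_norm(v, g=2.0)
assert np.allclose(w, [1.2, 1.6])
assert np.isclose(np.linalg.norm(w), 2.0)
```

Gradient descent on ''(v, g)'' instead of ''w'' then automatically adapts the effective learning rate for the weight direction.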
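For concreteness, a common (often loose) upper bound on the Lipschitz constant of a feed-forward network is the product of the spectral norms of its weight matrices; this is a standard bound, sketched in NumPy, not code from the Qi et al. paper:

```python
import numpy as np

def lipschitz_upper_bound(weights):
    """Upper bound on the Lipschitz constant of an MLP whose activations
    are 1-Lipschitz (e.g. ReLU): the product of the spectral norms
    (largest singular values) of the weight matrices."""
    return float(np.prod([np.linalg.norm(W, ord=2) for W in weights]))

# Two diagonal layers with spectral norms 2 and 0.5: the bound is 1.
W1 = np.diag([2.0, 1.0])
W2 = np.diag([0.5, 0.25])
assert np.isclose(lipschitz_upper_bound([W1, W2]), 1.0)
```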
  
===== On Global Optimization of Neural Networks =====
  
===== Miscellaneous Topics =====

==== Effect of Skip Connections ====
  * [[https://arxiv.org/pdf/1702.08591.pdf|Balduzzi et al 2017 - The Shattered Gradients Problem: If resnets are the answer, then what is the question?]]
  * [[https://arxiv.org/pdf/1701.09175.pdf|Orhan & Pitkow 2017 - Skip Connections Eliminate Singularities]]
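A toy sketch of the residual idea these papers study (a ResNet-style identity skip path; the NumPy code and near-zero-weight example are my own illustration, not from either paper):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def plain_block(x, W):
    """A plain feed-forward layer."""
    return relu(W @ x)

def residual_block(x, W):
    """The same layer with an identity skip connection. The skip path
    keeps a direct route for the signal (and, during training, the
    gradient), which Balduzzi et al. argue keeps gradients correlated
    with depth instead of "shattering" into noise."""
    return x + relu(W @ x)

# With near-zero weights, the plain block almost erases the input,
# while the residual block stays close to the identity map.
x = np.ones(4)
W = 0.01 * np.eye(4)
assert np.allclose(plain_block(x, W), 0.01)
assert np.allclose(residual_block(x, W), 1.01)
```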
  
===== People =====
ml/optimization_in_deep_learning.1701393653.txt.gz · Last modified: 2023/12/01 01:20 by jmflanig
