====== ml:infinite_neural_networks ======

  * Neural Tangent Kernel
    * [[https://rajatvd.github.io/NTK/|Understanding the Neural Tangent Kernel (blog post)]]
    * [[https://lilianweng.github.io/posts/2022-09-08-ntk/|Some Math behind Neural Tangent Kernel (blog post)]]
  * [[https://arxiv.org/pdf/2007.15801.pdf|Lee et al 2020 - Finite Versus Infinite Neural Networks: an Empirical Study]]
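For quick reference (standard notation, not taken from the linked posts): the Neural Tangent Kernel is the inner product of parameter gradients of the network output, and in the infinite-width limit this kernel stays essentially constant throughout training:

```latex
\Theta(x, x') \;=\; \nabla_\theta f(x;\theta)^{\top}\,\nabla_\theta f(x';\theta)
```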
  
    * [[https://arxiv.org/pdf/2203.03466.pdf|Yang et al 2022 - Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer]]
  * **[[https://arxiv.org/pdf/2007.15801.pdf|Lee et al 2020 - Finite Versus Infinite Neural Networks: an Empirical Study]]**
  * [[https://arxiv.org/abs/2010.01092|Liu et al 2020 - On the linearity of large non-linear models: when and why the tangent kernel is constant]]
  * [[https://arxiv.org/pdf/2012.00152.pdf|Domingos 2020 - Every Model Learned by Gradient Descent Is Approximately a Kernel Machine]]
  * [[https://arxiv.org/pdf/2206.08720.pdf|Novak et al 2022 - Fast Finite Width Neural Tangent Kernel]] [[https://youtu.be/8MWOhYg89fY?t=10984|video]] [[https://github.com/google/neural-tangents|github]] [[https://colab.research.google.com/github/google/neural-tangents/blob/main/notebooks/empirical_ntk_fcn.ipynb|code example]]
  * **[[https://openreview.net/pdf?id=tUMr0Iox8XW|Yang et al 2022 - Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features]]**
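The empirical (finite-width) NTK that the Novak et al paper computes efficiently can be sketched naively in plain NumPy: stack the per-example parameter gradients into a Jacobian ''J'' and form the Gram matrix ''J Jᵀ''. This is a minimal illustration only — the network, finite-difference gradients, and all names here are illustrative, not from the linked codebases:

```python
import numpy as np

def mlp(params, x, hidden=8):
    """Tiny one-hidden-layer MLP with scalar output for a 2-d input."""
    W1 = params[:2 * hidden].reshape(hidden, 2)
    b1 = params[2 * hidden:3 * hidden]
    w2 = params[3 * hidden:4 * hidden]
    return w2 @ np.tanh(W1 @ x + b1)

def grad_params(f, params, x, eps=1e-5):
    """Central finite-difference gradient of f(params, x) w.r.t. params."""
    g = np.zeros_like(params)
    for i in range(params.size):
        d = np.zeros_like(params)
        d[i] = eps
        g[i] = (f(params + d, x) - f(params - d, x)) / (2 * eps)
    return g

def empirical_ntk(f, params, xs):
    """NTK Gram matrix: Theta[i, j] = grad f(x_i) . grad f(x_j)."""
    J = np.stack([grad_params(f, params, x) for x in xs])
    return J @ J.T

rng = np.random.default_rng(0)
hidden = 8
params = rng.normal(size=4 * hidden) / np.sqrt(hidden)
xs = rng.normal(size=(4, 2))
Theta = empirical_ntk(mlp, params, xs)  # symmetric PSD 4x4 kernel matrix
```

Since ''Theta = J Jᵀ'' is a Gram matrix, it is symmetric positive semidefinite by construction; the point of the fast-NTK work above is avoiding ever materializing the full Jacobian.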
===== Notes =====
Jeff's thoughts: Although objective functions for training finite neural networks are usually non-convex, for networks with infinitely many hidden units (infinitely wide) they are usually convex. This is because an infinitely wide network is linear in its output weights: its output is a linear combination over all possible hidden units in the function space, so a convex loss composed with this linear map remains convex.
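The convexity claim can be made precise with a standard sketch (notation here is mine, not from the note): write the infinite-width network as an integral over all possible hidden-unit features, which is linear in the output weight function //w//, so any convex loss stays convex in //w//:

```latex
f_w(x) \;=\; \int w(\theta)\,\phi(x;\theta)\,d\theta
\qquad
L(w) \;=\; \sum_i \ell\big(f_w(x_i),\,y_i\big)
\ \text{is convex in } w \text{ whenever } \ell \text{ is convex.}
```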
ml/infinite_neural_networks · Last modified: 2023/06/15 07:36 (external edit)
