Table of Contents
Optimization
Lectures and Books
Theory
Papers
Courses
Blog Posts
People
Related Pages
Optimization
Lectures and Books
Duchi - Introductory Lectures on Stochastic Optimization
Bottou et al 2016 - Optimization Methods for Large-Scale Machine Learning
Survey paper with theory results and proofs (it even covers the convergence rate of SGD on non-convex objectives)
Black-box Optimization
Boyd & Vandenberghe - Convex Optimization
Luenberger - Linear and Nonlinear Programming
Great book
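As a toy illustration of the non-convex stochastic setting analyzed in the Bottou et al. survey, here is a minimal SGD sketch. The objective, noise scale, and step size are all illustrative choices of mine, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_grad(w):
    # Noisy gradient of the non-convex toy objective f(w) = w^4 - 3*w^2,
    # which has a local max at w = 0 and minima at w = ±sqrt(3/2).
    # The additive Gaussian noise stands in for minibatch sampling.
    true_grad = 4 * w**3 - 6 * w
    return true_grad + rng.normal(scale=0.1)

w = 0.1    # start near the flat region around the local max
lr = 0.01  # constant step size, as in the simplest analyses
for t in range(2000):
    w -= lr * stochastic_grad(w)

# w should end up near one of the minima at ±sqrt(3/2) ≈ ±1.225
```

Despite the noise, the iterates settle into a neighborhood of a stationary point whose radius scales with the step size times the noise level, which is the flavor of guarantee the survey proves.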
Theory
Agarwal & Bottou 2015 - A Lower Bound for the Optimization of Finite Sums
Ghadimi & Lan 2013 - Stochastic First- and Zeroth-order Methods for Nonconvex Stochastic Programming
Cited by Jin 2021 for Theorem 2.10
Jin et al 2021 - On Nonconvex Optimization for Machine Learning: Gradients, Stochasticity, and Saddle Points
Overview of results
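The saddle-point results surveyed by Jin et al. rest on perturbed gradient descent: add a small random perturbation whenever the gradient is small, so the iterate escapes strict saddles. A heavily simplified sketch (toy objective and constants are my own; the real algorithm also spaces perturbations apart and checks for sufficient function decrease):

```python
import numpy as np

rng = np.random.default_rng(1)

# f(x, y) = x^2 + y^4 - y^2 has a strict saddle at the origin
# and minima at (0, ±1/sqrt(2)).  (Toy objective for illustration.)
def grad(w):
    x, y = w
    return np.array([2 * x, 4 * y**3 - 2 * y])

w = np.array([0.5, 0.0])  # starts on the saddle's stable manifold (y = 0)
lr = 0.05
for t in range(3000):
    g = grad(w)
    if np.linalg.norm(g) < 1e-3:
        # Perturbation step: uniform jitter kicks the iterate off the
        # stable manifold, after which the negative curvature in y takes over.
        w = w + rng.uniform(-0.01, 0.01, size=2)
    w = w - lr * g

# Plain GD from (0.5, 0) would converge to the saddle; the perturbed
# version reaches a neighborhood of a minimum at (0, ±1/sqrt(2)).
```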
Papers
2018 - Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
2015 - Cyclical Learning Rates for Training Neural Networks
2020 - On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
Mei et al 2023 - Stochastic Gradient Succeeds for Bandits
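The cyclical learning rate paper's basic "triangular" policy is easy to state in code: the learning rate ramps linearly between a lower and an upper bound, repeating every 2 * step_size steps. A sketch, with the bounds and cycle length as placeholder values:

```python
def triangular_lr(step, base_lr=1e-4, max_lr=1e-2, step_size=2000):
    # Triangular cyclical schedule: lr rises linearly from base_lr to
    # max_lr over step_size steps, falls back over the next step_size
    # steps, and repeats.
    cycle = step // (2 * step_size)
    x = abs(step / step_size - 2 * cycle - 1)  # in [0, 1] within a cycle
    return base_lr + (max_lr - base_lr) * (1 - x)
```

So `triangular_lr(0)` gives the base rate, `triangular_lr(2000)` the max, and `triangular_lr(4000)` the base again. The super-convergence paper builds on the same idea with a single large cycle ("1cycle").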
Courses
See also: Courses and Seminars.
Large-Scale Optimization for Data Science @ Princeton
Mathematical Data Science Reading Group @ Princeton
EPFL Course - Optimization for Machine Learning
Convex Optimization and Approximation @ Berkeley
Blog Posts
2018 - Setting the learning rate of your neural network
2017 - The Two Phases of Gradient Descent in Deep Learning
Optimization For Training Deep Models Part I
People
Dale Schuurmans
Related Pages
Application: Optimization
NN Training
Optimizers
Optimization in Deep Learning