Differences

--- ml:theory:regret_bounds [2022/05/09 08:31] – jmflanig
+++ ml:theory:regret_bounds [2023/06/15 07:36] (current) – external edit 127.0.0.1
@@ Line 4: / Line 4: @@
 ==== Surveys and Theses ====
   * [[http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=9AF556097D53F9170E8DC85C381F6971?doi=10.1.1.161.9973&rep=rep1&type=pdf|Shalev-Shwartz 2007 - Online Learning: Theory, Algorithms, and Applications]]  See section 2.4 (page 27 in pdf) for historical references
+  * [[https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.419.9&rep=rep1&type=pdf|Bottou - Online Learning and Stochastic Approximations]]
 ==== Key Papers ====
-  * [[Zinkevich 2003 - Online Convex Programming and Generalized Infinitesimal Gradient Ascent]]
+  * [[https://www.aaai.org/Papers/ICML/2003/ICML03-120.pdf|Zinkevich 2003 - Online Convex Programming and Generalized Infinitesimal Gradient Ascent]] See also the [[https://www.cs.cmu.edu/~maz/publications/techconvex.pdf|CMU tech report]]
 ===== Regret Bounds =====
@@ Line 18: / Line 19: @@
 ===== Related Pages =====
+  * [[ml:theory:Multi-Armed Bandit]]
   * [[ml:Online Learning]]
+  * [[ml:Reinforcement Learning#Theory|Reinforcement Learning - Theory]]