====== Hyperparameter Tuning ======

===== Overviews =====
  * [[https://en.wikipedia.org/wiki/Hyperparameter_optimization|Wikipedia: Hyperparameter Optimization]]
  * [[https://arxiv.org/pdf/2007.15745|Yang & Shami 2020 - On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice]]
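The overviews above survey the standard search strategies (grid search, random search, Bayesian optimization, and so on). All of them share the same basic loop: sample a configuration, train and score it, keep the best. A minimal random-search sketch of that loop in pure Python — the objective function and search ranges here are hypothetical stand-ins for a real training run:

```python
import random

def train_and_score(learning_rate, num_layers):
    # Stand-in for a real training run returning a validation score.
    # (Hypothetical objective: peaks at learning_rate=0.1, num_layers=3.)
    return -((learning_rate - 0.1) ** 2) - 0.05 * (num_layers - 3) ** 2

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            # Log-uniform sampling is common for learning rates.
            "learning_rate": 10 ** rng.uniform(-4, 0),
            "num_layers": rng.randint(1, 6),
        }
        score = train_and_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best_params, best_score = random_search(100)
print(best_params, best_score)
```

The libraries in the Software section below automate this loop and add smarter samplers and early stopping, but the interface — a parameterized objective plus a search space — is essentially the same.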
  
===== Papers =====
  * [[http://proceedings.mlr.press/v70/chen17e/chen17e.pdf|Chen 2017 - Learning to Learn without Gradient Descent by Gradient Descent]] Learns a black-box (gradient-free) optimizer.  Can be applied to hyperparameter tuning.
  * [[https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46180.pdf|Golovin et al 2017 - Google Vizier: A Service for Black-Box Optimization]] Was, or still is, "the de facto parameter tuning engine at Google."
  * [[https://arxiv.org/pdf/1703.01785.pdf|Franceschi et al 2017 - Forward and Reverse Gradient-Based Hyperparameter Optimization]] Uses forward-mode gradients for hyperparameter tuning
  * [[https://arxiv.org/pdf/1707.05589.pdf|Melis et al 2017 - On the State of the Art of Evaluation in Neural Language Models]] Uses Google Vizier for large-scale automatic black-box hyperparameter tuning
  * ASHA: **[[https://arxiv.org/pdf/1810.05934.pdf|Li et al 2018 - A System for Massively Parallel Hyperparameter Tuning]]**. A good method.  Ray Tune has an [[https://docs.ray.io/en/latest/tune/api_docs/schedulers.html|implementation]]
  * **[[https://arxiv.org/pdf/1909.03004.pdf|Dodge et al 2019 - Show Your Work: Improved Reporting of Experimental Results]]**
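ASHA (listed above) builds on successive halving: evaluate many configurations at a small training budget, keep only the top fraction, and repeat with a larger budget for the survivors, so cheap early runs prune most of the search space. A minimal synchronous sketch — the asynchronous scheduling that gives ASHA its name is omitted, and the noisy toy objective is a hypothetical stand-in for partial training:

```python
import math
import random

def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the top 1/eta of configs at each rung, growing the budget by eta."""
    budget = min_budget
    while len(configs) > 1:
        scored = [(evaluate(cfg, budget), cfg) for cfg in configs]
        scored.sort(key=lambda pair: pair[0], reverse=True)  # higher score = better
        keep = max(1, len(configs) // eta)
        configs = [cfg for _, cfg in scored[:keep]]
        budget *= eta  # survivors get more training budget at the next rung
    return configs[0]

rng = random.Random(0)

def evaluate(cfg, budget):
    # Hypothetical objective: best near lr=0.1, noisier at small budgets.
    noise = rng.gauss(0, 1.0 / math.sqrt(budget))
    return -abs(cfg["lr"] - 0.1) + noise

configs = [{"lr": 10 ** rng.uniform(-4, 0)} for _ in range(27)]
best = successive_halving(configs, evaluate)
print(best)
```

With 27 starting configurations and eta=3, the rungs shrink 27 → 9 → 3 → 1 while the budget grows 1 → 3 → 9, so most of the total compute goes to the few most promising configurations.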
  
===== Software =====
See also the list of software in [[https://drive.google.com/uc?export=view&id=1UBPdRsJIy494_Go6KzCKLdHRYLbfMfMF#page=51|Ch 10, p. 322 (p. 51 in pdf)]] of [[book:HOML]].
  * [[https://docs.ray.io/en/latest/tune/index.html|Ray Tune]] ([[https://pytorch.org/tutorials/beginner/hyperparameter_tuning_tutorial.html|PyTorch tutorial]])
  * [[https://optuna.org/|Optuna]] Nicer interface than Ray Tune
  * Scikit-Optimize (skopt)
  
===== Related Pages =====
  * [[nlp:Experimental Method]]
  * [[ml:optimizers#Gradient-Free Optimizers]]
  * [[Scaling Laws]]
ml/hyperparameter_tuning.1652575254.txt.gz · Last modified: 2023/06/15 07:36 (external edit)
