====== Ensembling ======
For models trained with cross-entropy, //the standard method of ensembling in NLP is to average the probabilities of the models at test time and predict using this averaged probability//. There are two standard ways to create the different models for ensembling:
  * **Multirun ensembling**: models come from training runs with different random seeds
  * **Checkpoint ensembling**: models come from different checkpoints of a single training run. This has the advantage that only a single training run is needed.

See [[https://www.amazon.com/Neural-Machine-Translation-Philipp-Koehn/dp/1108497322|Koehn 2020]] p. 148 or [[http://mt-class.org/jhu/assets/nmt-book.pdf#page=67|pdf here]].
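Test-time probability averaging can be sketched as follows. This is a minimal NumPy illustration, not tied to any particular framework; the per-model class distributions below are made-up placeholders standing in for softmax outputs from separately trained (or separately checkpointed) models:

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average per-class probability distributions from several models
    and predict via the argmax of the averaged distribution.

    prob_list: list of arrays, each of shape (num_classes,) or
               (batch, num_classes), holding softmax outputs.
    Returns (predicted class indices, averaged probabilities).
    """
    avg = np.mean(np.stack(prob_list), axis=0)
    return avg.argmax(axis=-1), avg

# Hypothetical softmax outputs from three models on one example:
probs = [
    np.array([0.6, 0.4]),
    np.array([0.2, 0.8]),
    np.array([0.3, 0.7]),
]
pred, avg = ensemble_predict(probs)
# Models 2 and 3 favor class 1 strongly enough to outvote model 1,
# so the ensemble predicts class 1.
```

For multirun ensembling, `prob_list` would come from models trained with different seeds; for checkpoint ensembling, from different checkpoints of one run.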