====== Ensembling ======
For models trained with cross-entropy, //the standard method of ensembling in NLP is to average the probabilities of the models at test time and predict using this averaged probability//. There are two standard ways to create the different models for ensembling:
  * **Multirun ensembling**: models come from training runs with different random seeds
  * **Checkpoint ensembling**: models come from different checkpoints of a single training run. This has the advantage that only a single training run is needed.

See [[https://www.amazon.com/Neural-Machine-Translation-Philipp-Koehn/dp/1108497322|Koehn 2020]] p. 148 or [[http://mt-class.org/jhu/assets/nmt-book.pdf#page=67|pdf here]].
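Test-time probability averaging can be sketched as follows. This is a minimal NumPy illustration, not tied to any particular framework; the per-model class distributions below are made-up placeholders standing in for softmax outputs from separately trained (or separately checkpointed) models:

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average per-class probability distributions from several models
    and predict via the argmax of the averaged distribution.

    prob_list: list of arrays, each of shape (num_classes,) or
               (batch, num_classes), holding softmax outputs.
    Returns (predicted class indices, averaged probabilities).
    """
    avg = np.mean(np.stack(prob_list), axis=0)
    return avg.argmax(axis=-1), avg

# Hypothetical softmax outputs from three models on one example:
probs = [
    np.array([0.6, 0.4]),
    np.array([0.2, 0.8]),
    np.array([0.3, 0.7]),
]
pred, avg = ensemble_predict(probs)
# Models 2 and 3 favor class 1 strongly enough to outvote model 1,
# so the ensemble predicts class 1.
```

For multirun ensembling, `prob_list` would come from models trained with different seeds; for checkpoint ensembling, from different checkpoints of one run.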