See also Language Model - Overviews.
For a history, see section 2.4 of Qiu 2020 or the related work in the GPT-2 paper.
Papers are sorted chronologically. For a large list of pre-trained models, see here.
List of popular models in chronological order. See also the list of Large Language Models.
Moved to Fine-Tuning.
Figure from Qiu 2020.
Figure from Liu 2020.
Figure from Qiu 2020.
Figure from Liu 2020. Key:
See also scaling laws.
Papers and projects in which LLMs were pretrained on academic compute budgets.