Large Reasoning Models
o1- or R1-style LLMs are often called “large reasoning models” (LRMs) (see Cuadron 2025).
Overviews
Papers
- OpenAI o1
- Learning to Reason with LLMs: has examples of the full reasoning chains.
- OpenAI o1 System Card (arXiv): a careful reading of section 2 reveals a lot about the training process.
- R1 replication on small datasets
- General papers
- Concise Reasoning
- Using RL
- Models
- Phi-4-Reasoning: Abdin et al 2025 - Phi-4-reasoning Technical Report
Related Pages
Last modified: 2025/05/29 07:26 by jmflanig