Reinforcement Learning

Overviews

Papers

NLP RL Papers

(Some of the papers above should be moved to this section)

Reinforcement Learning with Verifiable Rewards

DeepSeek-R1-Zero-style reinforcement learning is sometimes called “reinforcement learning (RL) on verifiable rewards” (see for example Zhou 2025) or “RL with outcome supervision.”
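As a rough illustration of what a "verifiable reward" can look like, the sketch below scores a model completion purely on its outcome: the reward is 1.0 if the extracted final answer matches a known reference and 0.0 otherwise. The function names and the convention that the final answer appears inside `\boxed{...}` are assumptions made for this example, not details taken from any particular paper or training setup.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} span, or None if absent.

    The \\boxed{...} convention is an assumption made for this sketch.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Outcome-only reward: 1.0 for an exact match with the reference, else 0.0.

    Intermediate reasoning in the completion is ignored; only the final
    answer is checked, which is what makes the reward automatically verifiable.
    """
    answer = extract_final_answer(completion)
    if answer is not None and answer == reference.strip():
        return 1.0
    return 0.0


good = verifiable_reward("6 * 7 = 42, so the answer is \\boxed{42}.", "42")
bad = verifiable_reward("I think the answer is \\boxed{41}.", "42")
missing = verifiable_reward("The answer is probably 42.", "42")
```

Exact string matching is the simplest possible checker; real setups may normalize answers or run unit tests instead, but the reward still depends only on a checkable outcome.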

See also Large Reasoning Models

Datasets

Theory

Inverse Reinforcement Learning (IRL)

In inverse reinforcement learning (IRL), the agent infers the reward function from demonstrations, i.e., example actions taken by policies that are assumed to be optimal.
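To make the idea concrete, here is a minimal sketch of one ingredient used by feature-matching approaches to IRL: compare the discounted state-visitation features of expert demonstrations with those of a baseline policy, and treat the difference as a direction for linear reward weights. This is a simplification under stated assumptions (one-hot state features, a hypothetical 4-state chain, hand-written trajectories), not a complete IRL algorithm.

```python
from typing import List, Sequence


def feature_expectations(
    trajectories: Sequence[Sequence[int]], n_states: int, gamma: float = 0.9
) -> List[float]:
    """Average discounted visitation count per state (one-hot state features)."""
    mu = [0.0] * n_states
    for traj in trajectories:
        for t, state in enumerate(traj):
            mu[state] += gamma ** t
    return [m / len(trajectories) for m in mu]


def reward_direction(
    expert_trajs: Sequence[Sequence[int]],
    baseline_trajs: Sequence[Sequence[int]],
    n_states: int,
    gamma: float = 0.9,
) -> List[float]:
    """Weights w = mu_expert - mu_baseline for a linear reward r(s) = w[s].

    States the expert visits more than the baseline get positive weight.
    """
    mu_expert = feature_expectations(expert_trajs, n_states, gamma)
    mu_baseline = feature_expectations(baseline_trajs, n_states, gamma)
    return [e - b for e, b in zip(mu_expert, mu_baseline)]


# Hypothetical demonstrations on a chain of states 0..3.
# The expert heads to state 3 and stays there; the baseline wanders.
expert = [[0, 1, 2, 3, 3], [1, 2, 3, 3, 3]]
baseline = [[0, 1, 0, 1, 0], [1, 0, 1, 2, 1]]

weights = reward_direction(expert, baseline, n_states=4)
best_state = max(range(4), key=lambda s: weights[s])
```

In this toy example the largest weight lands on the state the expert seeks out. A full method would iterate: re-solve for a policy under the current reward estimate, recompute its feature expectations, and update the weights.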

Resources

Refer to this page for an up-to-date list of resources.

  • General
    • awesome-rl by dbobrenko is a repository of RL-related resources grouped by RL sub-domain.
    • awesome-rl by aikorea is another repository of RL-related resources, grouped by resource type.
  • Books
  • Papers
    • Key Papers in Deep RL by OpenAI is a list of must-read papers on classic RL algorithms, selected by OpenAI researchers.
    • Deep Reinforcement Learning by Yuxi Li is a comprehensive and up-to-date RL survey paper. It can also serve as a tutorial for people who want to have a general understanding of the field.
  • Courses
    • CS285 Deep Reinforcement Learning at UC Berkeley by Professor Sergey Levine is the latest deep RL course. It covers more recent topics and delves deeper into each of them, so it might be difficult for people who are new to RL. [Course website] [Playlist]
    • Introduction to Reinforcement Learning with David Silver is an introductory RL course that can serve as a starting point for beginners in RL. [Course website] [Playlist]
  • Blogs
    • A (Long) Peek into Reinforcement Learning by Lilian Weng is a good blog post for beginners in RL. For most of the algorithms, it can give you a high-level intuition to help you with further systematic study.
  • Tutorials
    • pytorch-rl by bentrevett is a practical introduction to RL using PyTorch.
    • OpenAI Spinning Up by OpenAI might be the best educational resource to start with in deep RL. It covers key concepts in RL, the main kinds of RL algorithms, and a tutorial on the policy gradient algorithm. It also provides a resource list and algorithm documentation.
  • Frameworks
    • OpenAI Gym by OpenAI is a toolkit for benchmarking RL algorithms.
  • Miscellaneous

People

ml/reinforcement_learning.txt · Last modified: 2025/07/14 05:40 by jmflanig
