Instruction Tuning

Overviews

Papers

RLHF” and that “PPO is unnecessarily complicated for a pre-trained LLM environment.”

Datasets

Models

  • FLAN-T5
  • Alpaca
  • LIMA

People

nlp/instruction-tuning.1746402534.txt.gz · Last modified: 2025/05/04 23:48 by jmflanig