nlp:human-in-the-loop
Differences
This shows you the differences between two versions of the page.
| nlp:human-in-the-loop [2025/05/01 09:44] – [RLHF] jmflanig | nlp:human-in-the-loop [2025/05/31 07:43] (current) – [RLHF] jmflanig | ||
|---|---|---|---|
| Line 35: | Line 35: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| - | * [[https:// | + | * [[https:// |
| - | * [[https:// | + | * [[https:// |
| + | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| Line 44: | Line 45: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| - | * InstructGPT: | + | * InstructGPT: |
| * Used PPO: [[https:// | * Used PPO: [[https:// | ||
| + | * [[https:// | ||
| + | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | * [[https:// | ||
| === Crowdsourcing & Data Collection === | === Crowdsourcing & Data Collection === | ||
nlp/human-in-the-loop.1746092667.txt.gz · Last modified: 2025/05/01 09:44 by jmflanig