nlp:vision_and_language
Differences
This shows you the differences between two versions of the page.
| nlp:vision_and_language [2025/06/05 04:38] – [Multimodal Foundation Models (Visual Language Models)] jmflanig | nlp:vision_and_language [2025/07/03 04:05] (current) – [Overviews] jmflanig | ||
|---|---|---|---|
| Line 6: | Line 6: | ||
| * **Multimodal Large Language Models (MLLMs)** | * **Multimodal Large Language Models (MLLMs)** | ||
| * [[https:// | * [[https:// | ||
| + | * [[https:// | ||
| + | * For Visual QA: | ||
| + | * [[https:// | ||
| + | * Evaluation of MLLMs: | ||
| + | * [[https:// | ||
| ===== Multimodal Foundation Models (Visual Language Models) ===== | ===== Multimodal Foundation Models (Visual Language Models) ===== | ||
| Line 18: | Line 23: | ||
| * [[https:// | * [[https:// | ||
| * [[https:// | * [[https:// | ||
| + | |||
| + | ==== Prompting Methods ==== | ||
| + | * [[https:// | ||
| ===== Multimodal Dialog Agents ===== | ===== Multimodal Dialog Agents ===== | ||
nlp/vision_and_language.1749098333.txt.gz · Last modified: 2025/06/05 04:38 by jmflanig