Vision and Language

This page covers vision and language tasks other than visual question answering (which is limited to question answering) and grounded language learning (which adds a learning component to the task).

Overviews

Multimodal Foundation Models (Visual Language Models)

Prompting Methods

Multimodal Dialog Agents

Multimodal Pretraining

Bibliographies

People

nlp/vision_and_language.txt · Last modified: 2025/07/03 04:05 by jmflanig
