Vision and Language

This page covers vision and language tasks other than visual question answering (which deals only with question answering) and grounded language learning (which adds a learning component to the task).

Overviews

Multimodal Foundation Models (Visual Language Models)

Multimodal Dialog Agents

Multimodal Pretraining

Bibliographies

People

nlp/vision_and_language.1693438710.txt.gz · Last modified: 2023/08/30 23:38 by jmflanig
