====== Visual Question Answering ======

===== Papers =====
  * [[https://arxiv.org/pdf/2210.02928.pdf|Chen et al 2022 - MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text]]
  * Neural Module Networks
    * [[https://openaccess.thecvf.com/content_cvpr_2016/papers/Andreas_Neural_Module_Networks_CVPR_2016_paper.pdf|Andreas et al 2016 - Neural Module Networks]]
    * [[https://openaccess.thecvf.com/content_ICCV_2017/papers/Hu_Learning_to_Reason_ICCV_2017_paper.pdf|Hu et al 2017 - Learning to Reason: End-to-End Module Networks for Visual Question Answering]]

===== Datasets =====
  * Visual QA: [[https://visualqa.org/]]

===== People =====
  * [[https://scholar.google.com/citations?user=dnZ8udEAAAAJ&hl=en|Jacob Andreas]]
  * [[https://scholar.google.com/citations?user=_bs7PqgAAAAJ&hl=en|Dhruv Batra]]

===== Related Pages =====
  * [[Grounded Language Learning]]
  * [[Image Captioning]]
  * [[Question Answering]]