Title
Coarse-to-Fine Visual Question Answering by Iterative, Conditional Refinement
Author
Burghouts, G.J.
Huizinga, W.
Publication year
2022
Abstract
Visual Question Answering (VQA) is a technique for answering natural language questions about an image. Recent methods have focused on incorporating knowledge into an improved VQA model, by augmenting the training set, representing scene graphs, or including reasoning. We also leverage knowledge to make VQA more robust, yet we take a different route: we take the VQA model as-is and extend it with a novel algorithm called Guided-VQA, which guides the questioning by leveraging knowledge to obtain better answers. This enables knowledge-extended VQA without having to retrain the VQA model, which is beneficial when computing resources and/or time to adapt to new knowledge are limited. We start from the observation that VQA has difficulty answering compositional and fine-grained questions. We propose to solve this with a coarse-to-fine scheme of posing questions. The proposed Guided-VQA algorithm is an iterative, conditional refinement that decomposes a compositional, fine-grained question into a sequence of coarse-to-fine questions by leveraging taxonomic knowledge about the objects involved. On Visual Genome, we show that it improves the answers significantly over standard VQA. This is relevant for robust deployment of VQA where resources or adaptation time are limited.
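The coarse-to-fine idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Guided-VQA implementation: the toy taxonomy, the question template, the function names and the `stub_vqa` stand-in for a pretrained VQA model are all assumptions made for illustration. It only shows the general pattern of walking a taxonomy from coarse to fine and asking the next, finer question only if the coarser one was confirmed.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent); not taken from the paper.
TAXONOMY: Dict[str, Optional[str]] = {
    "animal": None,
    "dog": "animal",
    "labrador": "dog",
}


def coarse_to_fine_path(concept: str, taxonomy: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain from the taxonomy root down to `concept`."""
    path: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = taxonomy[node]
    return path[::-1]


def guided_answer(
    image: Any,
    concept: str,
    vqa: Callable[[Any, str], str],
    taxonomy: Dict[str, Optional[str]] = TAXONOMY,
) -> List[str]:
    """Ask one question per taxonomy level, coarse to fine.

    Refinement is conditional: a finer question is asked only if the
    coarser one was answered "yes". Returns the confirmed levels.
    """
    confirmed: List[str] = []
    for level in coarse_to_fine_path(concept, taxonomy):
        # Assumed question template, chosen for this sketch only.
        question = f"Is there any {level} in the image?"
        if vqa(image, question).strip().lower() != "yes":
            break
        confirmed.append(level)
    return confirmed


def stub_vqa(image: Any, question: str) -> str:
    """Stand-in for a frozen VQA model: confirms only the coarse concepts."""
    return "yes" if ("animal" in question or "dog" in question) else "no"


if __name__ == "__main__":
    print(guided_answer(None, "labrador", stub_vqa))
```

In this sketch the VQA model is only called, never retrained, which mirrors the abstract's point that the knowledge sits in the questioning procedure rather than in the model weights.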
Subject
External knowledge
Image analysis
Iterative refinement
Visual Question Answering
Knowledge management
Visual languages
Coarse to fine
Natural language questions
Novel algorithm
Question Answering
Scene-graphs
Training sets
Iterative methods
To reference this document use:
http://resolver.tudelft.nl/uuid:49075619-5020-4428-a871-26a8c2aa44d7
TNO identifier
970942
Publisher
Springer Science and Business Media Deutschland GmbH
ISBN
9783031064296
ISSN
0302-9743
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 21st International Conference on Image Analysis and Processing, ICIAP 2022, 23 May 2022 through 27 May 2022, 418-428
Document type
conference paper