Ambiguous Question Detection from Visual Scanpath
Description: The goal of this project is to develop a system that automatically detects ambiguous questions in order to improve the quality of annotations in visual question answering (VQA) datasets. Ambiguous questions lead to variability in responses, which complicates model training and evaluation. By combining gaze scanpath data with an analysis of question content, the project aims to identify characteristics of ambiguity, offering a novel approach to improving data reliability and annotation processes.
Task:
- Create data splits from the VQA-MHUG dataset
- Implement and compare several classification models for detecting ambiguous questions (a possible starting point is sketched after this list)
- Conduct a thorough analysis and evaluation of the results
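A possible starting point for the implementation is sketched below: pre-extracted gaze scanpath features are combined with the question text in a scikit-learn pipeline and passed to a simple classifier. The feature names, the placeholder data, and the binary ambiguity label are illustrative assumptions and do not reflect the actual VQA-MHUG schema; the sketch only shows how data splitting, model fitting, and evaluation could fit together.

# Minimal sketch: ambiguity classification from scanpath features + question text.
# All column names, features, and the placeholder data are assumptions, not the VQA-MHUG schema.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
# Placeholder table standing in for per-question scanpath statistics extracted from VQA-MHUG.
df = pd.DataFrame({
    "question": rng.choice(
        ["what color is the ball", "is the man happy", "how many dogs are there"], n
    ),
    "fixation_count": rng.integers(3, 40, n),        # e.g. number of fixations on the image
    "mean_fixation_dur": rng.uniform(0.1, 0.6, n),   # e.g. mean fixation duration in seconds
    "scanpath_len": rng.uniform(100, 2000, n),       # e.g. total scanpath length in pixels
    "ambiguous": rng.integers(0, 2, n),              # hypothetical binary ambiguity label
})

gaze_cols = ["fixation_count", "mean_fixation_dur", "scanpath_len"]
features = ColumnTransformer([
    ("gaze", StandardScaler(), gaze_cols),           # standardize numeric gaze features
    ("text", TfidfVectorizer(), "question"),         # bag-of-words features from the question
])
clf = Pipeline([("features", features), ("model", LogisticRegression(max_iter=1000))])

# Stratified train/test split, model fitting, and a basic evaluation report.
X_train, X_test, y_train, y_test = train_test_split(
    df[gaze_cols + ["question"]], df["ambiguous"],
    test_size=0.2, random_state=42, stratify=df["ambiguous"],
)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))

In practice, the logistic regression baseline would be swapped for or compared against other classifiers, and the placeholder features replaced by the scanpath and question representations chosen during the project.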
Supervisor: Susanne Hindennach and Takumi Nishiyasu
Distribution: 20% Literature Review, 60% Implementation, 20% Analysis and Evaluation
Requirements: Experience with Python, ideally some knowledge of statistics
Literature: [1] Susanne Hindennach, Lei Shi, and Andreas Bulling. 2024. Explaining Disagreement in Visual Question Answering Using Eye Tracking. In Proceedings of the 2024 Symposium on Eye Tracking Research and Applications (ETRA '24), June 4–7, 2024, Glasgow, United Kingdom. ACM, New York, NY, USA, 7 pages.
[2] D. Han, J. Choe, S. Chun, J. J. Y. Chung, M. Chang, S. Yun, J. Y. Song, and S. J. Oh. 2023. Neglected Free Lunch – Learning Image Classifiers Using Annotation Byproducts. In Proceedings of the International Conference on Computer Vision (ICCV).
[3] Ekta Sood, Fabian Kögel, Florian Strohm, Prajit Dhar, and Andreas Bulling. 2021. VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in VQA. In Proceedings of the 2021 Conference on Computational Natural Language Learning (CoNLL).