IEEE ICASSP 2021 || Toronto, Ontario, Canada || 6-11 June 2021

Paper Detail

Paper ID

MLSP-31.3

Paper Title

ADAPTIVE RE-BALANCING NETWORK WITH GATE MECHANISM FOR LONG-TAILED VISUAL QUESTION ANSWERING

Authors

Hongyu Chen, Ruifang Liu, Han Fang, Ximing Zhang, Beijing University of Posts and Telecommunications, China

Session

MLSP-31: Recommendation Systems

Location

Gather.Town

Session Time:

Thursday, 10 June, 14:00 - 14:45

Presentation Time:

Thursday, 10 June, 14:00 - 14:45

Presentation

Poster

Topic

Machine Learning for Signal Processing: [MLR-LMM] Learning from multimodal data

Abstract

Visual Question Answering (VQA) is a challenging task that requires a fine-grained semantic understanding of visual and textual content. Existing works focus on better modality representations. However, these methods give little consideration to the long-tailed data distribution in common VQA datasets. The extreme class imbalance biases training so that models perform well on head classes but fail on tail classes. Therefore, we propose a unified Adaptive Re-balancing Network (ARN) that handles classification in both head and tail classes, comprehensively improving performance for VQA. Specifically, two training branches are introduced, each with its own role, and are trained iteratively: the network first learns universal representations and then progressively emphasizes the tail data through the re-balancing branch with adaptive learning. Meanwhile, contextual information in the question is vital for guiding accurate visual attention. Thus, our network is further equipped with a novel gate mechanism that gives higher weight to contextual information. Experimental results on common benchmarks such as VQA-v2 demonstrate the superiority of our method over state-of-the-art methods.
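
The abstract does not give formulas, so the following is only a minimal Python sketch of the general ideas it names, not the authors' ARN implementation. It assumes inverse-frequency sampling for the re-balancing branch, a parabolic schedule that shifts weight from the conventional branch to the re-balancing branch over training (a schedule borrowed from generic two-branch long-tailed training, not confirmed by this page), and an element-wise sigmoid gate. All function names and parameters are hypothetical.

```python
import math


def inverse_frequency_probs(class_counts):
    """Sampling probabilities proportional to 1 / class frequency (assumed re-balancing rule)."""
    inverse = [1.0 / count for count in class_counts]
    total = sum(inverse)
    return [value / total for value in inverse]


def conventional_weight(epoch, total_epochs):
    """Weight on the conventional branch: 1 at the start, 0 at the end (assumed parabolic decay)."""
    return 1.0 - (epoch / total_epochs) ** 2


def combined_loss(loss_conventional, loss_rebalanced, alpha):
    """Mix the two branch losses; a small alpha emphasizes the re-balancing (tail) branch."""
    return alpha * loss_conventional + (1.0 - alpha) * loss_rebalanced


def gated_fusion(context, word, gate_logits):
    """Element-wise gate g = sigmoid(z); output is g * context + (1 - g) * word."""
    fused = []
    for c, w, z in zip(context, word, gate_logits):
        g = 1.0 / (1.0 + math.exp(-z))
        fused.append(g * c + (1.0 - g) * w)
    return fused
```

Under these assumptions, early epochs are dominated by the conventional branch (universal representations), later epochs by the re-balancing branch (tail answers), and larger gate logits push the fused question feature toward the contextual vector.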

2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information
