IVMSP-13.4
MULTI-MODAL LEARNING WITH TEXT MERGING FOR TEXTVQA
Changsheng Xu, Zhenlong Xu, Yifan He, Shuigeng Zhou, Fudan University, China; Jihong Guan, Tongji University, China
Session:
Multi-modal Analysis and Synthesis
Track:
Image, Video, and Multidimensional Signal Processing
Location:
Gather Area I
Presentation Time:
Mon, 9 May, 22:00 - 22:45 China Time (UTC +8)
Mon, 9 May, 14:00 - 14:45 UTC
Session Chair:
Qiongqiong Wang, Institute for Infocomm Research, A*STAR
Session IVMSP-13
IVMSP-13.1: DEEP VIDEO INPAINTING GUIDED BY AUDIO-VISUAL SELF-SUPERVISION
Kyuyeon Kim, Junsik Jung, Woo Jae Kim, Sung-Eui Yoon, Korea Advanced Institute of Science and Technology, Republic of Korea
IVMSP-13.2: NAVIGATING AUDIO-VISUAL EVENT DETECTION ACROSS MISMATCHED MODALITIES
Guangwei Li, Xuenan Xu, Mengyue Wu, Kai Yu, Shanghai Jiao Tong University, China
IVMSP-13.3: LOOK, LISTEN AND PAY MORE ATTENTION: FUSING MULTI-MODAL INFORMATION FOR VIDEO VIOLENCE DETECTION
Dong-Lai Wei, Yang Liu, Jing Liu, Xin-Hua Zeng, Fudan University, China; Chen-Geng Liu, The University of Melbourne, China; Xiao-Guang Zhu, Shanghai Jiao Tong University, China
IVMSP-13.4: MULTI-MODAL LEARNING WITH TEXT MERGING FOR TEXTVQA
Changsheng Xu, Zhenlong Xu, Yifan He, Shuigeng Zhou, Fudan University, China; Jihong Guan, Tongji University, China
IVMSP-13.5: REAL-TIME AUDIO-GUIDED MULTI-FACE REENACTMENT
Jiangning Zhang, Xianfang Zeng, Chao Xu, Yong Liu, Zhejiang University, Hangzhou, China
IVMSP-13.6: A NOVEL PART FEATURE INTEGRATION AND FUSION METHOD FOR FINE-GRAINED VEHICLE RECOGNITION
Ping Wang, Yijie Cao, Lei Lu, Xi'an Jiaotong University, China