AUD-6.6
TIME-DOMAIN AUDIO-VISUAL SPEECH SEPARATION ON LOW QUALITY VIDEOS
Yifei Wu, Chenda Li, Yanmin Qian, Shanghai Jiao Tong University, China; Jinfeng Bai, Zhongqin Wu, TAL Education Group, China
Session:
Audio Security and Multi-Modal Systems
Track:
Audio and Acoustic Signal Processing
Location:
Gather Area K
Presentation Time:
Sun, 8 May, 23:00 - 23:45 China Time (UTC +8)
Sun, 8 May, 15:00 - 15:45 UTC
Session Chair:
Hung-Yi Lee, National Taiwan University
Session AUD-6
AUD-6.1: On Adversarial Robustness of Large-scale Audio Visual Learning
Juncheng B Li, Xinjian Li, Po-Yao (Bernie) Huang, Florian Metze, Carnegie Mellon University, United States of America; Shuhui Qu, Stanford University, United States of America
AUD-6.2: Adversarial sample detection for speaker verification by neural vocoders
Haibin Wu, Po-chun Hsu, Hung-yi Lee, National Taiwan University, China; Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Tencent, China; Zhiyong Wu, Shenzhen International Graduate School, Tsinghua University, China; Helen Meng, The Chinese University of Hong Kong, Hong Kong
AUD-6.3: SOURCE MIXING AND SEPARATION ROBUST AUDIO STEGANOGRAPHY
Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji, Sony Group Corporation, Japan
AUD-6.4: MULTI-MODAL PRE-TRAINING FOR AUTOMATED SPEECH RECOGNITION
David Chan, University of California, Berkeley, United States of America; Shalini Ghosh, Debmalya Chakrabarty, Björn Hoffmeister, Amazon, United States of America
AUD-6.5: SPEAKER-TARGETED AUDIO-VISUAL SPEECH RECOGNITION USING A HYBRID CTC/ATTENTION MODEL WITH INTERFERENCE LOSS
Ryota Tsunoda, Ryoichi Takashima, Tetsuya Takiguchi, Kobe University, Japan; Ryo Aihara, Yoshie Imai, Mitsubishi Electric Corporation, Japan
AUD-6.6: TIME-DOMAIN AUDIO-VISUAL SPEECH SEPARATION ON LOW QUALITY VIDEOS
Yifei Wu, Chenda Li, Yanmin Qian, Shanghai Jiao Tong University, China; Jinfeng Bai, Zhongqin Wu, TAL Education Group, China