SS-13: Recent Advances in Multichannel and Multimodal Machine Learning for Speech Applications |
Session Type: Poster |
Time: Thursday, 10 June, 16:30 - 17:15 |
Location: Gather.Town |
Session Chairs: Ante Jukić, Apple and Ahmed Abdelaziz, Apple |
SS-13.1: AN EMPIRICAL STUDY OF VISUAL FEATURES FOR DNN BASED AUDIO-VISUAL SPEECH ENHANCEMENT IN MULTI-TALKER ENVIRONMENTS |
Shrishti Saha Shetu; Fraunhofer IIS |
Soumitro Chakrabarty; Fraunhofer IIS |
Emanuël Habets; Fraunhofer IIS |
SS-13.2: ON THE ROLE OF VISUAL CUES IN AUDIOVISUAL SPEECH ENHANCEMENT |
Zakaria Aldeneh; University of Michigan |
Anushree Prasanna Kumar; Apple |
Barry-John Theobald; Apple |
Erik Marchi; Apple |
Sachin Kajarekar; Apple |
Devang Naik; Apple |
Ahmed Hussen Abdelaziz; Apple |
SS-13.3: CONVOLUTIVE TRANSFER FUNCTION INVARIANT SDR TRAINING CRITERIA FOR MULTI-CHANNEL REVERBERANT SPEECH SEPARATION |
Christoph Boeddeker; Paderborn University |
Wangyou Zhang; Shanghai Jiao Tong University |
Tomohiro Nakatani; NTT Corporation |
Keisuke Kinoshita; NTT Corporation |
Tsubasa Ochiai; NTT Corporation |
Marc Delcroix; NTT Corporation |
Naoyuki Kamo; NTT Corporation |
Yanmin Qian; Shanghai Jiao Tong University |
Reinhold Haeb-Umbach; Paderborn University |
SS-13.4: DIRECTIONAL ASR: A NEW PARADIGM FOR E2E MULTI-SPEAKER SPEECH RECOGNITION WITH SOURCE LOCALIZATION |
Aswin Shanmugam Subramanian; Johns Hopkins University |
Chao Weng; Tencent AI Lab |
Shinji Watanabe; Johns Hopkins University |
Meng Yu; Tencent AI Lab |
Yong Xu; Tencent AI Lab |
Shi-Xiong Zhang; Tencent AI Lab |
Dong Yu; Tencent AI Lab |
SS-13.5: COMMUNICATION-COST AWARE MICROPHONE SELECTION FOR NEURAL SPEECH ENHANCEMENT WITH AD-HOC MICROPHONE ARRAYS |
Jonah Casebeer; University of Illinois at Urbana-Champaign |
Jamshed Kaikaus; University of Illinois at Urbana-Champaign |
Paris Smaragdis; University of Illinois at Urbana-Champaign |
SS-13.6: DEEP MULTI-FRAME MVDR FILTERING FOR SINGLE-MICROPHONE SPEECH ENHANCEMENT |
Marvin Tammen; University of Oldenburg |
Simon Doclo; University of Oldenburg |