SPE-49.1
CONDITIONAL DIFFUSION PROBABILISTIC MODEL FOR SPEECH ENHANCEMENT
Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Carnegie Mellon University, United States of America; Alexander Richard, Facebook, United States of America; Cheng Yu, Yu Tsao, Academia Sinica, Taiwan
Session:
Speech Enhancement and Dereverberation
Track:
Speech and Language Processing
Location:
Gather Area B
Presentation Time:
Wed, 11 May, 20:00 - 20:45 China Time (UTC +8)
Wed, 11 May, 12:00 - 12:45 UTC
Session Chair:
Yu Tsao, Academia Sinica
Session SPE-49
SPE-49.1: CONDITIONAL DIFFUSION PROBABILISTIC MODEL FOR SPEECH ENHANCEMENT
Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Carnegie Mellon University, United States of America; Alexander Richard, Facebook, United States of America; Cheng Yu, Yu Tsao, Academia Sinica, Taiwan
SPE-49.2: DEEPFILTERNET: A LOW COMPLEXITY SPEECH ENHANCEMENT FRAMEWORK FOR FULL-BAND AUDIO BASED ON DEEP FILTERING
Hendrik Schröter, Andreas Maier, Friedrich-Alexander University Erlangen-Nuremberg (FAU), Germany; Alberto N. Escalante-B., Tobias Rosenkranz, WS Audiology, Germany
SPE-49.3: METRICGAN-U: UNSUPERVISED SPEECH ENHANCEMENT/ DEREVERBERATION BASED ONLY ON NOISY/ REVERBERATED SPEECH
Szu-Wei Fu, Microsoft, United States of America; Cheng Yu, Kuo-Hsuan Hung, Yu Tsao, Academia Sinica, Taiwan; Mirco Ravanelli, Mila-Quebec AI Institute, Canada
SPE-49.4: UFORMER: A UNET BASED DILATED COMPLEX & REAL DUAL-PATH CONFORMER NETWORK FOR SIMULTANEOUS SPEECH ENHANCEMENT AND DEREVERBERATION
Yihui Fu, Shubo Lv, Yukai Jv, Lei Xie, Audio, Speech and Language Processing Group (ASLP@NPU), Northwestern Polytechnical University, Xi'an, China; Yun Liu, Jingdong Li, Dawei Luo, AI Interaction Division, Sogou Inc., Beijing, China
SPE-49.5: ATTENUATION OF ACOUSTIC EARLY REFLECTIONS IN TELEVISION STUDIOS USING PRETRAINED SPEECH SYNTHESIS NEURAL NETWORK
Tomer Rosenbaum, Israel Cohen, Technion – Israel Institute of Technology, Israel; Emil Winebrand, Insoundz Ltd., Israel
SPE-49.6: THE EFFECT OF PARTIAL TIME-FREQUENCY MASKING OF THE DIRECT SOUND ON THE PERCEPTION OF REVERBERANT SPEECH
Lior Madmoni, Boaz Rafaely, Ben-Gurion University of the Negev, Israel; Shir Tibor, Israel Nelken, The Hebrew University of Jerusalem, Israel