IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

SPE-49.3

METRICGAN-U: UNSUPERVISED SPEECH ENHANCEMENT/ DEREVERBERATION BASED ONLY ON NOISY/ REVERBERATED SPEECH

Szu-Wei Fu, Microsoft, United States of America; Cheng Yu, Kuo-Hsuan Hung, Yu Tsao, Academia Sinica, Taiwan; Mirco Ravanelli, Mila-Quebec AI Institute, Canada

Session:
Speech Enhancement and Dereverberation

Track:
Speech and Language Processing

Location:
Gather Area B

Presentation Time:
Wed, 11 May, 20:00 - 20:45 China Time (UTC +8)
Wed, 11 May, 12:00 - 12:45 UTC

Session Chair:
Yu Tsao, Academia Sinica
Session SPE-49
SPE-49.1: CONDITIONAL DIFFUSION PROBABILISTIC MODEL FOR SPEECH ENHANCEMENT
Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Carnegie Mellon University, United States of America; Alexander Richard, Facebook, United States of America; Cheng Yu, Yu Tsao, Academia Sinica, Taiwan
SPE-49.2: DEEPFILTERNET: A LOW COMPLEXITY SPEECH ENHANCEMENT FRAMEWORK FOR FULL-BAND AUDIO BASED ON DEEP FILTERING
Hendrik Schröter, Andreas Maier, Friedrich-Alexander University Erlangen-Nuremberg (FAU), Germany; Alberto N. Escalante-B., Tobias Rosenkranz, WS Audiology, Germany
SPE-49.3: METRICGAN-U: UNSUPERVISED SPEECH ENHANCEMENT/ DEREVERBERATION BASED ONLY ON NOISY/ REVERBERATED SPEECH
Szu-Wei Fu, Microsoft, United States of America; Cheng Yu, Kuo-Hsuan Hung, Yu Tsao, Academia Sinica, Taiwan; Mirco Ravanelli, Mila-Quebec AI Institute, Canada
SPE-49.4: UFORMER: A UNET BASED DILATED COMPLEX & REAL DUAL-PATH CONFORMER NETWORK FOR SIMULTANEOUS SPEECH ENHANCEMENT AND DEREVERBERATION
Yihui Fu, Shubo Lv, Yukai Jv, Lei Xie, Audio, Speech and Language Processing Group (ASLP@NPU), Northwestern Polytechnical University, Xi'an, China; Yun Liu, Jingdong Li, Dawei Luo, AI Interaction Division, Sogou Inc., Beijing, China
SPE-49.5: ATTENUATION OF ACOUSTIC EARLY REFLECTIONS IN TELEVISION STUDIOS USING PRETRAINED SPEECH SYNTHESIS NEURAL NETWORK
Tomer Rosenbaum, Israel Cohen, Technion – Israel Institute of Technology, Israel; Emil Winebrand, Insoundz Ltd., Israel
SPE-49.6: THE EFFECT OF PARTIAL TIME-FREQUENCY MASKING OF THE DIRECT SOUND ON THE PERCEPTION OF REVERBERANT SPEECH
Lior Madmoni, Boaz Rafaely, Ben-Gurion University of the Negev, Israel; Shir Tibor, Israel Nelken, The Hebrew University of Jerusalem, Israel