Paper ID | AUD-6.1 |
Paper Title |
ICASSP 2021 ACOUSTIC ECHO CANCELLATION CHALLENGE: INTEGRATED ADAPTIVE ECHO CANCELLATION WITH TIME ALIGNMENT AND DEEP LEARNING-BASED RESIDUAL ECHO PLUS NOISE SUPPRESSION |
Authors |
Renhua Peng, Linjuan Cheng, Chengshi Zheng, Xiaodong Li, Institute of Acoustics, Chinese Academy of Sciences, China |
Session | AUD-6: Active Noise Control, Echo Reduction, and Feedback Reduction 2: Active Noise Control and Echo Cancellation |
Location | Gather.Town |
Session Time: | Tuesday, 08 June, 16:30 - 17:15 |
Presentation Time: | Tuesday, 08 June, 16:30 - 17:15 |
Presentation |
Poster |
Topic |
Audio and Acoustic Signal Processing: [AUD-NEFR] Active Noise Control, Echo Reduction and Feedback Reduction |
Abstract |
This paper describes a three-stage acoustic echo cancellation (AEC) and suppression framework for the ICASSP 2021 AEC-Challenge. In the first stage, a partitioned block frequency-domain adaptive filter cancels the linear echo components without distorting the near-end speech; the time delay between the far-end reference signal and the microphone signal is estimated and compensated beforehand. In the second stage, a deep complex U-Net integrated with a gated recurrent unit is proposed to further suppress the residual echo components. In the last stage, an extremely tiny deep complex U-Net is trained to suppress environmental noise, which also further increases the echo return loss enhancement (ERLE) without dramatically increasing the computational complexity. Experimental results show that the proposed three-stage framework achieves an ERLE of over 50 dB in both single-talk and double-talk scenarios, and improves the perceptual evaluation of speech quality (PESQ) score by about 0.7 in double-talk scenarios. Subjective results show that the proposed framework outperforms the AEC-Challenge baseline ResRNN by 0.12 points in terms of MOS. |
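Two quantities in the abstract can be illustrated with a short sketch: the ERLE metric and the time-delay compensation that precedes the adaptive filter. The abstract does not say how the delay is estimated, so the cross-correlation search below is an assumption chosen for illustration, not the authors' method. The function names `erle_db` and `estimate_delay` and the parameters `eps` and `max_lag` are hypothetical. ERLE is computed here in its common form, 10·log10 of the microphone signal power over the residual power after cancellation.

```python
import math


def erle_db(mic, residual, eps=1e-12):
    """ERLE in dB: microphone power over residual power after cancellation.

    `eps` (an assumed safeguard) avoids division by zero on silent input.
    """
    p_mic = sum(x * x for x in mic) / len(mic)
    p_res = sum(x * x for x in residual) / len(residual)
    return 10.0 * math.log10((p_mic + eps) / (p_res + eps))


def estimate_delay(far_end, mic, max_lag):
    """Return the lag in samples (0..max_lag) that maximizes the
    cross-correlation between the far-end reference and the microphone
    signal. Brute-force time-domain search, for illustration only.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(far_end), len(mic) - lag)
        score = sum(far_end[i] * mic[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def compensate_delay(far_end, lag):
    """Delay the far-end reference by `lag` samples (zero-padded at the
    start) so that it lines up with the echo in the microphone signal."""
    return [0.0] * lag + list(far_end)
```

In practice the aligned reference would then be fed to the adaptive filter, and `erle_db` would be evaluated on far-end single-talk segments, where the microphone signal contains echo only. A real-time system would more likely use a frequency-domain correlation than this brute-force search.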