2021 IEEE International Conference on Acoustics, Speech and Signal Processing

Technical Program

Paper ID	SS-13.5
Paper Title	COMMUNICATION-COST AWARE MICROPHONE SELECTION FOR NEURAL SPEECH ENHANCEMENT WITH AD-HOC MICROPHONE ARRAYS
Authors	Jonah Casebeer, Jamshed Kaikaus, Paris Smaragdis, University of Illinois at Urbana-Champaign, United States
Session	SS-13: Recent Advances in Multichannel and Multimodal Machine Learning for Speech Applications
Location	Gather.Town
Session Time	Thursday, 10 June, 16:30 - 17:15
Presentation Time	Thursday, 10 June, 16:30 - 17:15
Presentation	Poster
Topic	Special Sessions: Recent Advances in Multichannel and Multimodal Machine Learning for Speech Applications
Abstract	In this paper, we present a method for jointly learning a microphone selection mechanism and a speech enhancement network for multi-channel speech enhancement with an ad-hoc microphone array. The attention-based microphone selection mechanism is trained to reduce communication costs through a penalty term that represents a trade-off between task performance and communication cost. Working within this trade-off, our method can stream from more microphones in lower-SNR scenes and from fewer microphones in higher-SNR scenes. We evaluate the model in complex echoic acoustic scenes with moving sources and show that it matches the performance of models that stream from a fixed number of microphones while reducing communication costs.
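
The penalty-term idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the per-microphone sigmoid gates, the use of the expected number of streamed microphones as the communication cost, and the names `penalized_loss`, `mic_scores`, and `cost_weight` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def penalized_loss(task_loss, mic_scores, cost_weight):
    """Return (total loss, expected number of streamed microphones).

    Assumption: each microphone has a selection score, and the sigmoid of
    that score is read as the probability of streaming from it. The
    communication cost is taken to be the sum of these probabilities.
    """
    gate_probs = [sigmoid(s) for s in mic_scores]
    expected_mics = sum(gate_probs)
    # cost_weight sets the trade-off: larger values favor fewer microphones.
    total = task_loss + cost_weight * expected_mics
    return total, expected_mics


# Hypothetical scores for three microphones.
scores = [2.0, 0.0, -2.0]
total, expected_mics = penalized_loss(task_loss=1.0, mic_scores=scores, cost_weight=0.1)
print(f"expected microphones: {expected_mics:.2f}, total loss: {total:.2f}")
```

In this sketch, raising `cost_weight` makes streaming more expensive relative to the enhancement loss, so a trained selector would be pushed toward using fewer microphones unless the scene is hard enough to justify the cost.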