Technical Program

Paper Detail

Presentation #18
Session: ASR IV
Location: Kallirhoe Hall
Session Time: Friday, December 21, 13:30 - 15:30
Presentation Time: Friday, December 21, 13:30 - 15:30
Presentation: Poster
Topic: Speech recognition and synthesis
Paper Title: A STUDY ON SPEECH ENHANCEMENT USING EXPONENT-ONLY FLOATING POINT QUANTIZED NEURAL NETWORK (EOFP-QNN)
Authors: Yi-Te Hsu, Academia Sinica, Taiwan; Yu-Chen Lin, Szu-Wei Fu, National Taiwan University, Taiwan; Yu Tsao, Academia Sinica, Taiwan; Tei-Wei Kuo, National Taiwan University, Taiwan
Abstract: Numerous studies have investigated the effectiveness of neural network quantization on pattern classification tasks. The present study is the first to investigate the performance of speech enhancement (a regression task in speech processing) using a novel exponent-only floating-point quantized neural network (EOFP-QNN). The proposed EOFP-QNN consists of two stages: mantissa quantization and exponent quantization. In the mantissa-quantization stage, EOFP-QNN learns to quantize the mantissa bits of the model parameters, preserving regression accuracy at the lowest possible mantissa precision. In the exponent-quantization stage, the exponent part of the parameters is further quantized without any additional performance degradation. We evaluated the proposed EOFP quantization technique on two types of neural networks, a bidirectional long short-term memory (BLSTM) network and a fully convolutional neural network (FCN), on a speech enhancement task. Experimental results showed that the model sizes can be significantly reduced (the quantized BLSTM and FCN models were only 18.75% and 21.89% of the size of the original models, respectively) while maintaining satisfactory speech-enhancement performance.
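
The idea of reducing mantissa precision in a floating-point parameter can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes IEEE 754 single-precision storage (1 sign bit, 8 exponent bits, 23 mantissa bits) and plain truncation, whereas the abstract describes a learned quantization. The function names `truncate_mantissa` and `split_fields` are hypothetical and exist only for this example. With `keep_bits = 0`, only the sign and exponent survive, which is how "exponent-only" is read here; that reading is an assumption.

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit integer as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def split_fields(x: float) -> tuple[int, int, int]:
    """Split x into its (sign, biased exponent, mantissa) bit fields."""
    bits = float_to_bits(x)
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def truncate_mantissa(x: float, keep_bits: int) -> float:
    """Zero all but the top keep_bits of the 23 mantissa bits.

    Illustrative assumption: simple truncation with no rounding and no
    retraining, unlike the learned procedure the abstract describes.
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    drop = 23 - keep_bits
    mask = (0xFFFFFFFF >> drop) << drop  # keeps sign, exponent, top mantissa bits
    return bits_to_float(float_to_bits(x) & mask)


if __name__ == "__main__":
    w = 0.15625  # exactly 1.25 * 2**-3 in binary floating point
    print(split_fields(w))
    print(truncate_mantissa(w, 2))  # the top two mantissa bits suffice for this value
    print(truncate_mantissa(w, 0))  # sign and exponent only: a signed power of two
```

As a side note on the reported sizes, 18.75% of a 32-bit word is exactly 6 bits (6 / 32 = 0.1875). Whether that is how the BLSTM figure arises is not stated in the abstract and should be treated as an inference.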