Presentation # | 18 |
Session: | ASR IV |
Session Time: | Friday, December 21, 13:30 - 15:30 |
Presentation Time: | Friday, December 21, 13:30 - 15:30 |
Presentation: |
Poster |
Topic: |
Speech recognition and synthesis: |
Paper Title: |
A STUDY ON SPEECH ENHANCEMENT USING EXPONENT-ONLY FLOATING POINT QUANTIZED NEURAL NETWORK (EOFP-QNN) |
Authors: |
Yi-Te Hsu; Academia Sinica |
Yu-Chen Lin; National Taiwan University |
Szu-Wei Fu; National Taiwan University |
Yu Tsao; Academia Sinica |
Tei-Wei Kuo; National Taiwan University |
Abstract: |
Numerous studies have investigated the effectiveness of neural network quantization on pattern classification tasks. The present study is the first to investigate the performance of speech enhancement (a regression task in speech processing) using a novel exponent-only floating-point quantized neural network (EOFP-QNN). The proposed EOFP-QNN consists of two stages: mantissa quantization and exponent quantization. In the mantissa-quantization stage, EOFP-QNN learns to quantize the mantissa bits of the model parameters, seeking the lowest mantissa precision that preserves regression accuracy. In the exponent-quantization stage, the exponent part of the parameters is further quantized without any additional performance degradation. We evaluated the proposed EOFP quantization technique on two types of neural networks, a bidirectional long short-term memory (BLSTM) network and a fully convolutional neural network (FCN), on a speech enhancement task. Experimental results showed that the model sizes can be significantly reduced (the quantized BLSTM and FCN models were only 18.75% and 21.89%, respectively, of the size of the original models) while maintaining satisfactory speech-enhancement performance. |
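
To make the idea of reducing mantissa precision concrete, the following is a minimal sketch, not the authors' implementation. It assumes the parameters are stored as IEEE 754 float32 values (1 sign bit, 8 exponent bits, 23 mantissa bits) and uses plain truncation, whereas the abstract describes a learned procedure whose details are not given here. The function names `quantize_mantissa` and `storage_ratio` are hypothetical, introduced only for illustration.

```python
import numpy as np

def quantize_mantissa(weights, keep_bits):
    """Zero all but the top `keep_bits` mantissa bits of float32 values.

    Assumption: simple truncation; the paper's own rounding or learning
    scheme is not specified in the abstract. With keep_bits=0 every value
    becomes a signed power of two (an "exponent-only" representation).
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    w = np.asarray(weights, dtype=np.float32)
    bits = w.view(np.uint32)               # reinterpret the raw bit patterns
    drop = 23 - keep_bits                  # number of low mantissa bits to clear
    mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)
    return (bits & mask).view(np.float32)

def storage_ratio(exponent_bits, mantissa_bits, sign_bits=1):
    """Bits per parameter relative to a 32-bit float (hypothetical helper)."""
    return (sign_bits + exponent_bits + mantissa_bits) / 32

# Exponent-only quantization of a few sample weights.
print(quantize_mantissa([0.75, 3.0, -5.0, 1.0, 0.0], keep_bits=0))
# Relative size of a 6-bit parameter versus a 32-bit one.
print(storage_ratio(exponent_bits=5, mantissa_bits=0))
```

As a purely arithmetic observation, 6 bits out of 32 is 18.75%, which matches the BLSTM figure quoted above. The abstract does not state the actual bit layout, so the 1-sign, 5-exponent split used in the last call is an assumption.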