Technical Program

Paper Detail

Presentation #2
Session:ASR I
Location:Kallirhoe Hall
Session Time:Wednesday, December 19, 10:00 - 12:00
Presentation Time:Wednesday, December 19, 10:00 - 12:00
Presentation: Poster
Topic: Speech recognition and synthesis:
Paper Title: DENSENET BLSTM FOR ACOUSTIC MODELING IN ROBUST ASR
Authors: Maximilian Strake, Pascal Behr, Timo Lohrenz, Tim Fingscheidt, Technische Universität Braunschweig, Germany
Abstract: In recent years, robust automatic speech recognition (ASR) has greatly taken benefit from the use of neural networks for acoustic modeling, although performance still degrades in severe noise conditions. Based on the previous success of models using convolutional and subsequent bidirectional long short-term memory (BLSTM) layers in the same network, we propose to use a densely connected convolutional network (DenseNet) as the first part of such a model, while the second is a BLSTM network. A particular contribution of our work is that we modify the DenseNet topology to become a kind of feature extractor for the subsequent BLSTM network operating on whole speech utterances. We evaluate our model on the 6-channel task of CHiME-4, and are able to consistently outperform a top-performing baseline based on wide residual networks and BLSTMs providing a 2.4% relative WER reduction on the real test set.