Presentation # | 2 |
Session: | ASR I |
Session Time: | Wednesday, December 19, 10:00 - 12:00 |
Presentation Time: | Wednesday, December 19, 10:00 - 12:00 |
Presentation: | Poster |
Topic: | Speech recognition and synthesis: |
Paper Title: | DENSENET BLSTM FOR ACOUSTIC MODELING IN ROBUST ASR |
Authors: | Maximilian Strake; Technische Universität Braunschweig |
| Pascal Behr; Technische Universität Braunschweig |
| Timo Lohrenz; Technische Universität Braunschweig |
| Tim Fingscheidt; Technische Universität Braunschweig |
Abstract: | In recent years, robust automatic speech recognition (ASR) has benefited greatly from the use of neural networks for acoustic modeling, although performance still degrades in severe noise conditions. Building on the earlier success of models that combine convolutional layers with subsequent bidirectional long short-term memory (BLSTM) layers in a single network, we propose to use a densely connected convolutional network (DenseNet) as the first part of such a model and a BLSTM network as the second. A particular contribution of our work is a modified DenseNet topology that serves as a feature extractor for the subsequent BLSTM network operating on whole speech utterances. We evaluate our model on the 6-channel task of CHiME-4 and consistently outperform a top-performing baseline based on wide residual networks and BLSTMs, achieving a 2.4% relative word error rate (WER) reduction on the real test set. |
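
The abstract reports its gain as a relative WER reduction, which is easy to confuse with an absolute one. The minimal sketch below shows how such a figure is usually computed. The function name `relative_wer_reduction` and both WER values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction in percent.

    Both arguments are word error rates in percent. The result is the
    improvement expressed as a fraction of the baseline error rate.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs for illustration only; NOT figures from the paper.
baseline = 10.0   # assumed baseline WER in percent
proposed = 9.76   # assumed WER of the proposed model in percent

print(round(relative_wer_reduction(baseline, proposed), 2))
```

With these assumed values, an absolute improvement of 0.24 percentage points corresponds to a relative reduction of about 2.4%, which is why relative figures should always be read alongside the baseline error rate.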