SLT 2018 • Technical Program • 2018 IEEE Workshop on Spoken Language Technology (SLT) | 18-21 December 2018


Paper Detail

Presentation #

Session: ASR I

Session Time: Wednesday, December 19, 10:00 - 12:00

Presentation Time: Wednesday, December 19, 10:00 - 12:00

Presentation: Poster

Topic: Speech recognition and synthesis

Paper Title: HIGH-DEGREE FEATURE FOR DEEP NEURAL NETWORK BASED ACOUSTIC MODEL

Authors:

Hoon Chung; Electronics and Telecommunications Research Institute

Sung Joo Lee; Electronics and Telecommunications Research Institute

Jeon Gue Park; Electronics and Telecommunications Research Institute

Abstract:

In this paper, we propose using high-degree features obtained by polynomial expansion to improve the discriminative performance of Deep Neural Network (DNN) based acoustic models. Thanks to the success of DNNs on high-dimensional non-linear classification problems, various kinds of acoustic information can be represented as high-dimensional features, and the non-linear characteristics of the speech signal can be robustly generalized in DNN-based acoustic models. Although it is not clear how DNNs solve the classification problem, the use of high-dimensional features rests on the well-known observation that higher dimensionality helps the separability of patterns. It is also well known that high-degree features increase the linear separability of non-linear input features. However, little work has exploited high-degree features. Therefore, in this work, we investigate high-degree features to further improve the performance of DNN-based acoustic models. The proposed approach was evaluated on the Wall Street Journal (WSJ) speech recognition domain. It achieved a relative error reduction of up to 21.8% on the Eval92 test set, reducing the word error rate from 4.82% to 3.77% with degree-2 polynomial expansion.
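
As a rough illustration of the idea in the abstract, the sketch below appends all degree-2 monomials to a feature vector and shows, on the XOR pattern, how the product term makes a non-linear problem linearly separable. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `polynomial_expand` and `relative_reduction` are hypothetical, and the abstract does not say which monomials the paper keeps, which acoustic features it expands, or how the expanded features are fed to the DNN.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_expand(frame, degree=2):
    """Return the input features followed by all monomials of degree 2..degree.

    Assumption: every cross term and power is kept; the paper may select a subset.
    """
    expanded = list(frame)
    for d in range(2, degree + 1):
        # Each index tuple (i, j, ...) names one monomial x_i * x_j * ...
        for idx in combinations_with_replacement(range(len(frame)), d):
            expanded.append(prod(frame[i] for i in idx))
    return expanded


def relative_reduction(baseline_wer, new_wer):
    """Relative error reduction in percent, as reported in the abstract."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# XOR is not linearly separable in (x1, x2), but it is after degree-2 expansion:
# with features [x1, x2, x1^2, x1*x2, x2^2], the linear score
# x1 + x2 - 2*x1*x2 reproduces the XOR label exactly.
weights = [1.0, 1.0, 0.0, -2.0, 0.0]
xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_scores = [
    sum(w * f for w, f in zip(weights, polynomial_expand(x)))
    for x in xor_inputs
]

print(polynomial_expand([1, 2, 3]))
print(xor_scores)
print(round(relative_reduction(4.82, 3.77), 1))
```

Note that the expansion grows quickly: a d-dimensional input gains d(d+1)/2 extra degree-2 terms, so this kind of sketch is practical only for modest feature dimensions or with some selection of terms.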