My SLT 2018 Schedule

Paper Detail

Presentation #	8
Session:	Deep Learning for Speech Synthesis
Session Time:	Tuesday, December 18, 14:00 - 17:00
Presentation Time:	Tuesday, December 18, 14:00 - 17:00
Presentation:	Invited talk, Discussion, Oral presentation, Poster session
Topic:	Speech recognition and synthesis
Paper Title:	MULTI-SCALE ALIGNMENT AND CONTEXTUAL HISTORY FOR ATTENTION MECHANISM IN SEQUENCE-TO-SEQUENCE MODEL
Authors:	Andros Tjandra; Nara Institute of Science and Technology
	Sakriani Sakti; Nara Institute of Science and Technology
	Satoshi Nakamura; Nara Institute of Science and Technology
Abstract:	A sequence-to-sequence model is a neural network module for mapping between two sequences of different lengths. The sequence-to-sequence model has three core modules: encoder, decoder, and attention. Attention is the bridge that connects the encoder and decoder modules and improves model performance in many tasks. In this paper, we propose two ideas to improve sequence-to-sequence model performance by enhancing the attention module. First, we maintain the history of the location and the expected context from several previous time-steps. Second, we apply multi-scale convolution from several previous attention vectors to the current decoder state. We applied our proposed framework to sequence-to-sequence speech recognition and text-to-speech systems. The results reveal that our proposed extension can improve performance significantly compared to a standard attention baseline.
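
The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the additive (MLP) scoring function, the fixed averaging kernels, the history length of two steps, and every name below (attend, multiscale_features, W_enc, W_dec, W_loc, W_ctx, v) are assumptions chosen for illustration. In a trained model the convolution kernels and projection matrices would be learned.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multiscale_features(prev_align, kernels):
    # Convolve one previous alignment vector with each 1-D kernel
    # ("same" padding) and stack the results: shape (T, len(kernels)).
    # Assumes every kernel is no longer than the alignment vector.
    return np.stack(
        [np.convolve(prev_align, k, mode="same") for k in kernels], axis=1
    )

def attend(enc, dec_state, prev_aligns, prev_contexts, params, kernels):
    # enc: (T, d_enc) encoder states; dec_state: (d_dec,) current decoder state.
    # prev_aligns: list of H alignment vectors (T,), the location history.
    # prev_contexts: list of H context vectors (d_enc,), the context history.
    loc = np.concatenate(
        [multiscale_features(a, kernels) for a in prev_aligns], axis=1
    )                                        # (T, H * K)
    ctx_hist = np.concatenate(prev_contexts)  # (H * d_enc,)
    hidden = np.tanh(
        enc @ params["W_enc"]
        + dec_state @ params["W_dec"]
        + loc @ params["W_loc"]
        + ctx_hist @ params["W_ctx"]
    )                                        # (T, d_att)
    align = softmax(hidden @ params["v"])    # (T,), sums to 1
    context = align @ enc                    # (d_enc,), expected context
    return align, context

# Toy dimensions (hypothetical): 6 encoder steps, history of 2 decoder steps.
T, d_enc, d_dec, d_att, H = 6, 4, 3, 5, 2
kernels = [np.ones(k) / k for k in (1, 3, 5)]  # three scales, fixed here
rng = np.random.default_rng(0)
params = {
    "W_enc": rng.normal(size=(d_enc, d_att)),
    "W_dec": rng.normal(size=(d_dec, d_att)),
    "W_loc": rng.normal(size=(H * len(kernels), d_att)),
    "W_ctx": rng.normal(size=(H * d_enc, d_att)),
    "v": rng.normal(size=d_att),
}
enc = rng.normal(size=(T, d_enc))
dec_state = rng.normal(size=d_dec)
prev_aligns = [np.full(T, 1.0 / T) for _ in range(H)]  # uniform initial history
prev_contexts = [np.zeros(d_enc) for _ in range(H)]

align, context = attend(enc, dec_state, prev_aligns, prev_contexts, params, kernels)
```

During decoding, the returned align and context would be appended to the history lists and the oldest entries dropped, so that each step sees the alignments and contexts of the last H steps.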