Presentation #: 10
Session: ASR III (End-to-End)
Session Time: Friday, December 21, 10:00 - 12:00
Presentation Time: Friday, December 21, 10:00 - 12:00
Presentation: Poster
Topic: Speech recognition and synthesis
Paper Title: DEEP CONTEXT: END-TO-END CONTEXTUAL SPEECH RECOGNITION
Authors: Golan Pundak (Google); Tara N. Sainath (Google); Rohit Prabhavalkar (Google); Anjuli Kannan (Google); Ding Zhao (Google)
Abstract:
In automatic speech recognition (ASR), what a user says depends on the particular context she is in. Typically, this context is represented as a set of word n-grams. In this work, we present a novel, all-neural, end-to-end (E2E) ASR system that utilizes such context. Our approach, which we refer to as Contextual Listen, Attend and Spell (CLAS), jointly optimizes the ASR components along with embeddings of the context n-grams. During inference, the CLAS system can be presented with context phrases which might contain out-of-vocabulary (OOV) terms not seen during training. We compare our proposed system to a more traditional contextualization approach, which performs shallow fusion between independently trained LAS and contextual n-gram models during beam search. Across a number of tasks, we find that the proposed CLAS system outperforms the baseline method by as much as 68% relative WER, indicating the advantage of joint optimization over individually trained components.
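Note: the abstract contrasts two contextualization mechanisms, which the following minimal NumPy sketch illustrates: attention over embedded context phrases (the CLAS idea) versus score interpolation between independently trained models (the shallow-fusion baseline). All names, dimensions, the mean-of-token-embeddings phrase encoder, and the dot-product attention are illustrative assumptions, not the paper's actual architecture, which uses a learned bias encoder trained jointly with the LAS components.

import numpy as np

rng = np.random.default_rng(0)
D = 8  # shared decoder-state / bias-embedding dimension (assumed)

def embed_phrase(token_ids, token_table):
    # The paper learns a bias encoder over phrase tokens; a mean of
    # token embeddings is a simple stand-in for illustration.
    return token_table[token_ids].mean(axis=0)

def bias_attention(decoder_state, bias_embeddings):
    # Attend over the phrase embeddings plus a "no-bias" option, so the
    # model can choose to ignore the supplied context entirely.
    no_bias = np.zeros(D)  # stand-in for a learned no-bias embedding
    keys = np.vstack([no_bias, bias_embeddings])
    scores = keys @ decoder_state            # dot-product scores (assumed)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over the options
    return weights @ keys, weights           # bias context vector, weights

def shallow_fusion_score(las_logprob, lm_logprob, lam=0.3):
    # Baseline: interpolate the scores of independently trained models
    # during beam search; lam is an assumed tuning weight.
    return las_logprob + lam * lm_logprob

# Toy usage: three context phrases and a random decoder state.
token_table = rng.normal(size=(100, D))      # toy token-embedding table
phrases = [[3, 14], [7], [42, 5, 9]]         # token ids of context phrases
bias_emb = np.stack([embed_phrase(p, token_table) for p in phrases])
ctx, w = bias_attention(rng.normal(size=D), bias_emb)
print("attention over [no-bias, phrase 1-3]:", np.round(w, 3))
# In CLAS, ctx is fed back into the decoder alongside the acoustic
# context, so bias embeddings and ASR components are trained jointly;
# in shallow fusion, the two models only meet at scoring time.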