My SLT 2018 Schedule

Paper Detail

Presentation #	2
Session:	Corpora and Evaluation Methodologies
Session Time:	Wednesday, December 19, 13:30 - 15:30
Presentation Time:	Wednesday, December 19, 13:30 - 15:30
Presentation:	Poster
Topic:	Spoken language corpora:
Paper Title:	TOWARDS FLUENT TRANSLATIONS FROM DISFLUENT SPEECH
Authors:	Elizabeth Salesky; Carnegie Mellon University
	Susanne Burger; Carnegie Mellon University
	Jan Niehues; Karlsruhe Institute of Technology
	Alex Waibel; Carnegie Mellon University
Abstract:	When translating speech, special consideration for conversational speech phenomena such as disfluencies is necessary. Most machine translation training data consists of well-formed written texts, causing issues when translating spontaneous speech. Previous work has introduced an intermediate step between speech recognition (ASR) and machine translation (MT) to remove disfluencies, making the data better-matched to typical translation text and significantly improving performance. However, with the rise of end-to-end speech translation systems, this intermediate step must be incorporated into the sequence-to-sequence architecture. Further, though translated speech datasets exist, they are typically news or rehearsed speech without many disfluencies (e.g. TED), or the disfluencies are translated into the references (e.g. Fisher). To generate clean translations from disfluent speech, cleaned references are necessary for evaluation. We introduce a corpus of cleaned references for Fisher Spanish-English for this task. We compare how different architectures handle disfluencies, and provide a baseline for removing disfluencies in end-to-end translation.