SLT 2018 • Technical Program • 2018 IEEE Workshop on Spoken Language Technology (SLT) | 18-21 December 2018

Paper Detail

Presentation #

Session:

ASR IV

Session Time:

Friday, December 21, 13:30 - 15:30

Presentation Time:

Friday, December 21, 13:30 - 15:30

Presentation:

Poster

Topic:

Speech recognition and synthesis:

Paper Title:

TRANSLITERATION BASED APPROACHES TO IMPROVE CODE-SWITCHED SPEECH RECOGNITION PERFORMANCE

Authors:

Jesse Emond; Google

Bhuvana Ramabhadran; Google

Brian Roark; Google

Pedro Moreno; Google

Min Ma; Google

Abstract:

Code-switching is a commonly occurring phenomenon in many multilingual communities, wherein a speaker switches between languages within a single utterance. Conventional Word Error Rate (WER) is not sufficient for measuring the performance of an Automated Speech Recognition (ASR) system on code-mixed languages due to ambiguities in transcription, misspellings, and borrowing of words from two different writing systems. These rendering errors artificially inflate the WER of an ASR system and complicate its evaluation. Furthermore, these errors make it harder to accurately evaluate modeling errors originating from the code-switched language and acoustic models. In this work, we propose the use of a new metric, transliteration-optimized Word Error Rate (toWER), that smooths out many of these irregularities by mapping all text to one writing system, and we demonstrate a correlation with the amount of code-switching present in a language. We also present a novel approach to acoustic and language modeling for bilingual code-switched Indic languages using the same transliteration approach. We demonstrate the robustness and generality of our proposed approach on state-of-the-art neural-network-based acoustic and language models. We obtain significant gains in ASR performance of up to 10% relative on Google Voice Search and dictation traffic in several Indic languages.
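
The idea behind toWER, as the abstract describes it, is to map reference and hypothesis to a single writing system before scoring. A minimal Python sketch of that idea follows. It is an illustration only, not the authors' implementation: the names `wer`, `to_devanagari`, and `TOY_TABLE` are hypothetical, and the toy lookup table stands in for whatever transliteration model the paper actually uses.

```python
from typing import Callable, Dict, List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str,
        normalize: Callable[[str], str] = lambda word: word) -> float:
    """WER after applying `normalize` to every word on both sides."""
    ref_words = [normalize(w) for w in reference.split()]
    hyp_words = [normalize(w) for w in hypothesis.split()]
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# Hypothetical toy table (assumption): Latin-script words mapped to Devanagari.
TOY_TABLE: Dict[str, str] = {"school": "स्कूल", "time": "टाइम"}


def to_devanagari(word: str) -> str:
    """Map a word to Devanagari if the toy table knows it; else leave it as-is."""
    return TOY_TABLE.get(word.lower(), word)


reference = "मैं स्कूल जा रहा हूँ"
hypothesis = "मैं school जा रहा हूँ"  # same word, rendered in Latin script

print(wer(reference, hypothesis))                 # 0.2 (one of five words differs)
print(wer(reference, hypothesis, to_devanagari))  # 0.0 after script normalization
```

In this example the recognizer output differs from the reference only in the script used for one borrowed word, so conventional WER counts an error while the script-normalized score does not.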