Technical Program

Paper Detail

Presentation #11
Session: Spoken Language Understanding
Location: Kallirhoe Hall
Session Time: Wednesday, December 19, 10:00 - 12:00
Presentation Time: Wednesday, December 19, 10:00 - 12:00
Presentation: Poster
Topic: Spoken language understanding
Paper Title: INVESTIGATING THE DOWNSTREAM IMPACT OF GRAPHEME-BASED ACOUSTIC MODELING ON SPOKEN UTTERANCE CLASSIFICATION
Authors: Ryan Price, Bhargav Srinivas Ch, Surbhi Singhal, Srinivas Bangalore, Interactions, LLC., United States
Abstract: Automatic speech recognition (ASR) and natural language understanding are critical components of spoken language understanding (SLU) systems. One obstacle to providing services with SLU systems in multiple languages is the cost of acquiring all of the language-specific resources that ASR requires in each language. Modeling graphemes eliminates the need to obtain a pronunciation dictionary, which maps from speech sounds to words, and is one way to reduce ASR resource dependencies when rapidly developing ASR in new languages. However, little is known about the downstream impact on SLU task performance of selecting graphemes as the acoustic modeling unit. This work investigates acoustic modeling for the ASR component of an SLU system using grapheme-based approaches together with convolutional and recurrent neural network architectures. We evaluate both ASR word accuracy and spoken utterance classification (SUC) accuracy for English, Italian, and Spanish language tasks and find that it is possible to achieve SUC accuracy comparable to that of conventional phoneme-based systems that leverage a pronunciation dictionary.
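
Illustration: the abstract notes that modeling graphemes removes the need for a hand-built pronunciation dictionary. A minimal sketch of why, assuming each grapheme is simply one letter of the word's spelling: the lexicon can be generated directly from a word list. The function name grapheme_lexicon and the letter-only filtering are hypothetical choices made for this example and are not taken from the paper.

```python
import unicodedata


def grapheme_lexicon(words):
    """Map each word to a list of grapheme units (assumed here: one unit per letter)."""
    lexicon = {}
    for word in words:
        # Normalize so accented letters such as "ñ" or "à" are single code points.
        key = unicodedata.normalize("NFC", word).lower()
        # Assumption: only alphabetic characters become units;
        # apostrophes, hyphens, and digits are dropped.
        units = [ch for ch in key if ch.isalpha()]
        if units:
            lexicon[key] = units
    return lexicon


# Example words for the three languages named in the abstract.
for word, units in grapheme_lexicon(["hello", "niño", "città"]).items():
    print(word, " ".join(units))
```

A phoneme-based system would instead need an expert-written or automatically predicted phoneme sequence for every word, which is the language-specific resource the abstract describes as costly.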