Presentation #: 1
Session: Detection, Paralinguistics and Coding
Location: Kallirhoe Hall
Session Time: Wednesday, December 19, 13:30 - 15:30
Presentation Time: Wednesday, December 19, 13:30 - 15:30
Presentation: Poster
Topic: Speaker/language recognition
Paper Title: EXPLORING END-TO-END ATTENTION-BASED NEURAL NETWORKS FOR NATIVE LANGUAGE IDENTIFICATION
Authors: Rutuja Ubale, Yao Qian, Keelan Evanini, Educational Testing Service Research, United States
Abstract: Automatic identification of speakers' native language (L1) based on their speech in a second language (L2) is a challenging research problem that can aid several spoken language technologies such as automatic speech recognition (ASR), speaker recognition, and voice biometrics in interactive voice applications. End-to-end learning, in which the features and the classification model are learned jointly in a single system, is an emerging field in the areas of speech recognition, speaker verification and spoken language understanding. In this paper, we present our study on attention-based end-to-end modeling for native language identification on a database of 11 different L1s. Using this methodology, we can determine the native language of the speaker directly from the raw acoustic features. Experimental results from our study show that our best end-to-end model can achieve promising results by capturing speech commonalities across L1s using an attention mechanism. In addition, fusion of proposed systems with the baseline system leads to significant performance improvements.