2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper ID SPE-56.3
Paper Title AN END-TO-END SPEECH ACCENT RECOGNITION METHOD BASED ON HYBRID CTC/ATTENTION TRANSFORMER ASR
Authors Qiang Gao, Haiwei Wu, Yanqing Sun, Yitao Duan; NetEase Youdao, China
Session SPE-56: Paralinguistics in Speech
Location Gather.Town
Session Time: Friday, 11 June, 14:00 - 14:45
Presentation Time: Friday, 11 June, 14:00 - 14:45
Presentation Poster
Topic Speech Processing: [SPE-ANLS] Speech Analysis
Abstract This paper proposes a novel accent recognition system built on a transformer-based end-to-end speech recognition framework. To incorporate pronunciation and linguistic knowledge into the network, we first pre-train an ASR model in a hybrid CTC/attention manner. Then, focusing on accent recognition, we extend the output token list by inserting accent labels into the transcripts and fine-tune the network parameters on an accented speech dataset. Our work is evaluated on the Interspeech 2020 Accented English Speech Recognition Challenge. Experiments show that our method achieves an accuracy of 72.39% on the test set and 80.98% on the development set, outperforming the baseline system by a large margin. Our submitted system ranked second in the accent recognition task of the challenge.
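
The two ideas in the abstract, extending the output token list with accent labels and training with a hybrid CTC/attention objective, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The accent tags, the choice to place the label at the start of the transcript, the helper names, and the CTC weight of 0.3 are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List

# Hypothetical accent tags; the real label set comes from the challenge data.
ACCENT_LABELS = ["<US>", "<UK>", "<CN>", "<IN>"]


def extend_vocab(vocab: Dict[str, int], labels: List[str]) -> Dict[str, int]:
    """Return a copy of vocab with accent labels appended as new token ids.

    Assumes the existing ids are contiguous from 0 to len(vocab) - 1.
    """
    extended = dict(vocab)
    for label in labels:
        if label not in extended:
            extended[label] = len(extended)
    return extended


def add_accent_label(tokens: List[str], accent: str) -> List[str]:
    """Insert the accent label at the start of the transcript (assumed position)."""
    return [accent] + tokens


def hybrid_loss(ctc_loss: float, att_loss: float, ctc_weight: float = 0.3) -> float:
    """Interpolate the CTC and attention losses; 0.3 is an illustrative weight."""
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * att_loss


if __name__ == "__main__":
    vocab = {"<blank>": 0, "hello": 1, "world": 2}
    extended = extend_vocab(vocab, ACCENT_LABELS)
    target = add_accent_label(["hello", "world"], "<UK>")
    print([extended[t] for t in target])
    print(hybrid_loss(ctc_loss=2.0, att_loss=1.0))
```

With labels in the target sequence, the decoder predicts the accent as one more output token, so the fine-tuned ASR model can serve as the accent classifier without a separate classification head.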