IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

SPE-87.3

IMPROVING NON-AUTOREGRESSIVE END-TO-END SPEECH RECOGNITION WITH PRE-TRAINED ACOUSTIC AND LANGUAGE MODELS

Keqi Deng, Zehui Yang, Gaofeng Cheng, Pengyuan Zhang, Institute of Acoustics, Chinese Academy of Sciences, China; Shinji Watanabe, Carnegie Mellon University, United States of America; Yosuke Higuchi, Waseda University, Japan

Session:
Speech Recognition: Knowledge Transfer and Contextualization

Track:
Speech and Language Processing

Location:
Gather Area C

Presentation Time:
Fri, 13 May, 22:00 - 22:45 China Time (UTC +8)
Fri, 13 May, 14:00 - 14:45 UTC

Session Chair:
Duc Le, Meta
Session SPE-87
SPE-87.1: KNOWLEDGE TRANSFER FROM LARGE-SCALE PRETRAINED LANGUAGE MODELS TO END-TO-END SPEECH RECOGNIZERS
Yotaro Kubo, Shigeki Karita, Michiel Bacchiani, Google, Japan
SPE-87.2: IMPROVING CTC-BASED SPEECH RECOGNITION VIA KNOWLEDGE TRANSFERRING FROM PRE-TRAINED LANGUAGE MODELS
Keqi Deng, Gaofeng Cheng, Ji Xu, Pengyuan Zhang, Institute of Acoustics, Chinese Academy of Sciences, China; Songjun Cao, Yike Zhang, Long Ma, Tencent Cloud Xiaowei, China
SPE-87.3: IMPROVING NON-AUTOREGRESSIVE END-TO-END SPEECH RECOGNITION WITH PRE-TRAINED ACOUSTIC AND LANGUAGE MODELS
Keqi Deng, Zehui Yang, Gaofeng Cheng, Pengyuan Zhang, Institute of Acoustics, Chinese Academy of Sciences, China; Shinji Watanabe, Carnegie Mellon University, United States of America; Yosuke Higuchi, Waseda University, Japan
SPE-87.4: KNOWLEDGE DISTILLATION FOR NEURAL TRANSDUCERS FROM LARGE SELF-SUPERVISED PRE-TRAINED MODELS
Xiaoyu Yang, Qiujia Li, Phil Woodland, University of Cambridge, United Kingdom of Great Britain and Northern Ireland
SPE-87.5: IMPROVING END-TO-END CONTEXTUAL SPEECH RECOGNITION WITH FINE-GRAINED CONTEXTUAL KNOWLEDGE SELECTION
Minglun Han, Shiyu Zhou, Bo Xu, Institute of Automation, Chinese Academy of Sciences, China; Linhao Dong, Zhenlin Liang, Meng Cai, Zejun Ma, Bytedance AI Lab, China
SPE-87.6: CONTEXTUAL ADAPTERS FOR PERSONALIZED SPEECH RECOGNITION IN NEURAL TRANSDUCERS
Kanthashree Mysore Sathyendra, Thejaswi Muniyappa, Feng-Ju Chang, Jing Liu, Jinru Su, Grant P. Strimel, Athanasios Mouchtaris, Siegfried Kunzmann, Amazon, United States of America