SPE-69.2
Compressing Transformer-based ASR Model by Task-driven Loss and Attention-based Multi-level Feature Distillation
Yongjie Lv, Longbiao Wang, Meng Ge, Kiyoshi Honda, Tianjin University, China; Sheng Li, Chenchen Ding, National Institute of Information & Communications Technology (NICT), Japan; Lixin Pan, Yuguang Wang, Huiyan Technology (TianJin) Co., LTD., China; Jianwu Dang, Japan Advanced Institute of Science and Technology, Japan
Session:
Attention Mechanism for Speech Models
Track:
Speech and Language Processing
Location:
Gather Area C
Presentation Time:
Thu, 12 May, 21:00 - 21:45 China Time (UTC +8)
Thu, 12 May, 13:00 - 13:45 UTC
Session Chair:
Takaaki Hori, Apple
Session SPE-69
SPE-69.1: LETR: A LIGHTWEIGHT AND EFFICIENT TRANSFORMER FOR KEYWORD SPOTTING
Kevin Ding, Martin Zong, Jiakui Li, Baoxiang Li, SenseTime, China
SPE-69.2: Compressing Transformer-based ASR Model by Task-driven Loss and Attention-based Multi-level Feature Distillation
Yongjie Lv, Longbiao Wang, Meng Ge, Kiyoshi Honda, Tianjin University, China; Sheng Li, Chenchen Ding, National Institute of Information & Communications Technology (NICT), Japan; Lixin Pan, Yuguang Wang, Huiyan Technology (TianJin) Co., LTD., China; Jianwu Dang, Japan Advanced Institute of Science and Technology, Japan
SPE-69.3: SPATIAL PROCESSING FRONT-END FOR DISTANT ASR EXPLOITING SELF-ATTENTION CHANNEL COMBINATOR
Dushyant Sharma, Rong Gong, James Fosburgh, Stanislav Kruchinin, Patrick Naylor, Ljubomir Milanovic, Nuance Communications Inc, United States of America
SPE-69.4: EFFICIENT SEQUENCE TRAINING OF ATTENTION MODELS USING APPROXIMATIVE RECOMBINATION
Nils-Philipp Wynands, Wilfried Michel, Jan Rosendahl, Ralf Schlüter, Hermann Ney, RWTH Aachen University, Germany
SPE-69.5: NEUFA: NEURAL NETWORK BASED END-TO-END FORCED ALIGNMENT WITH BIDIRECTIONAL ATTENTION MECHANISM
Jingbei Li, Yi Meng, Zhiyong Wu, Tsinghua University, China; Helen Meng, The Chinese University of Hong Kong, China; Qiao Tian, Yuping Wang, Yuxuan Wang, ByteDance, China
SPE-69.6: CONFORMER-BASED SPEECH RECOGNITION WITH LINEAR NYSTRÖM ATTENTION AND ROTARY POSITION EMBEDDING
Lahiru Samarakoon, Tsun-Yat Leung, Fano Labs, Hong Kong