IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

SPE-58.3

Tts4pretrain 2.0: Advancing the use of text and speech in ASR pretraining with consistency and contrastive losses

Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Moreno, Gary Wang, Google, United States of America

Session:
Speech Recognition: Acoustic Modeling III

Track:
Speech and Language Processing

Location:
Gather Area C

Presentation Time:
Wed, 11 May, 22:00 - 22:45 China Time (UTC +8)
Wed, 11 May, 14:00 - 14:45 UTC

Session Chair:
Arun Narayanan, Google
Session SPE-58
SPE-58.1: Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, ASAPP, United States of America; Kilian Weinberger, Yoav Artzi, ASAPP and Cornell University, United States of America
SPE-58.2: Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy
Yosuke Higuchi, Waseda University, Mitsubishi Electric Research Laboratories, Japan; Niko Moritz, Facebook AI, United Kingdom of Great Britain and Northern Ireland; Jonathan Le Roux, Takaaki Hori, Mitsubishi Electric Research Laboratories, United States of America
SPE-58.3: Tts4pretrain 2.0: Advancing the use of text and speech in ASR pretraining with consistency and contrastive losses
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Moreno, Gary Wang, Google, United States of America
SPE-58.4: SYNT++: Utilizing Imperfect Synthetic Data to Improve Speech Recognition
Ting-Yao Hu, Ashish Shrivastava, Jen-Hao Rick Chang, Hema Koppula, Oncel Tuzel, Apple, United States of America; Mohammadreza Armandpour, Texas A&M University, United States of America
SPE-58.5: Pseudo-Labeling for Massively Multilingual Speech Recognition
Loren Lugosch, McGill University/Mila, Canada; Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert, Facebook AI Research, United States of America
SPE-58.6: DP-DWA: Dual-Path Dynamic Weight Attention Network with Streaming DFSMN-SAN for Automatic Speech Recognition
Dongpeng Ma, Yiwen Wang, Liqiang He, Mingjie Jin, Dan Su, Dong Yu, Tencent, China