IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

SPE-L6: Speech Recognition II
Thu, 26 May, 15:30 - 18:00 China Time (UTC +8)
Thu, 26 May, 07:30 - 10:00 UTC
Location: Simpor Junior Ballroom 4811-3
Session Co-Chairs: Lei Wang, KLASS Engineering & Solutions Pte Ltd and Tsendsuren Munkhdalai, Google
Track: Speech and Language Processing

SPE-L6.1: BEING GREEDY DOES NOT HURT: SAMPLING STRATEGIES FOR END-TO-END SPEECH RECOGNITION

Jahn Heymann, Egor Lakomkin, Leif Raedel, Amazon Inc., Germany

SPE-L6.2: TRANSFORMER-BASED STREAMING ASR WITH CUMULATIVE ATTENTION

Mohan Li, Shucong Zhang, Cătălin Zorilă, Rama Doddipatla, Toshiba Cambridge Research Laboratory, Toshiba Europe Ltd, United Kingdom of Great Britain and Northern Ireland

SPE-L6.3: HYBRID RNN-T/ATTENTION-BASED STREAMING ASR WITH TRIGGERED CHUNKWISE ATTENTION AND DUAL INTERNAL LANGUAGE MODEL INTEGRATION

Takafumi Moriya, Takanori Ashihara, Atsushi Ando, Hiroshi Sato, Tomohiro Tanaka, Kohei Matsuura, Ryo Masumura, Marc Delcroix, NTT Corporation, Japan; Takahiro Shinozaki, Tokyo Institute of Technology, Japan

SPE-L6.4: USTED: IMPROVING ASR WITH A UNIFIED SPEECH AND TEXT ENCODER-DECODER

Bolaji Yusuf, Bogazici University, Turkey; Ankur Gandhe, Alex Sokolov, Amazon, United States of America

SPE-L6.5: BILINGUAL END-TO-END ASR WITH BYTE-LEVEL SUBWORDS

Liuhui Deng, Roger Hsiao, Arnab Ghoshal, Apple, United States of America

SPE-L6.6: MULTI-TURN RNN-T FOR STREAMING RECOGNITION OF MULTI-PARTY SPEECH

Ilya Sklyar, Anna Piunova, Amazon, Germany; Xianrui Zheng, University of Cambridge, United Kingdom of Great Britain and Northern Ireland; Yulan Liu, DeepMind, United Kingdom of Great Britain and Northern Ireland

SPE-L6.7: FAST CONTEXTUAL ADAPTATION WITH NEURAL ASSOCIATIVE MEMORY FOR ON-DEVICE PERSONALIZED SPEECH RECOGNITION

Tsendsuren Munkhdalai, Khe Chai Sim, Angad Chandorkar, Fan Gao, Mason Chua, Trevor Strohman, Françoise Beaufays, Google, United States of America

SPE-L6.8: GPU-ACCELERATED FORWARD-BACKWARD ALGORITHM WITH APPLICATION TO LATTICE-FREE MMI

Lucas Ondel, Léa-Marie Lam-Yee-Mui, Caio Corro, Laboratoire Interdisciplinaire des Sciences du Numérique, France; Martin Kocour, Lukas Burget, Brno University of Technology, Czech Republic

SPE-L6.9: RETRIEVING SPEAKER INFORMATION FROM PERSONALIZED ACOUSTIC MODELS FOR SPEECH RECOGNITION

Salima Mdhaffar, Jean-François Bonastre, Natalia Tomashenko, Yannick Estève, LIA - University of Avignon, France; Marc Tommasi, Inria - University of Lille, France

SPE-L6.10: MULTISTREAM NEURAL ARCHITECTURES FOR CUED SPEECH RECOGNITION USING A PRE-TRAINED VISUAL FEATURE EXTRACTOR AND CONSTRAINED CTC DECODING

Sanjana Sankar, Denis Beautemps, Thomas Hueber, Centre National de la Recherche Scientifique, France