IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

SPE-46.5

MULTI-CHANNEL END-TO-END NEURAL DIARIZATION WITH DISTRIBUTED MICROPHONES

Shota Horiguchi, Yuki Takashima, Yohei Kawaguchi, Hitachi, Ltd., Japan; Paola Garcia, Johns Hopkins University, United States of America; Shinji Watanabe, Carnegie Mellon University, United States of America

Session:
Multi Speaker and Multi-Channel Speech Recognition and Processing

Track:
Speech and Language Processing

Location:
Gather Area C

Presentation Time:
Tue, 10 May, 23:00 - 23:45 China Time (UTC +8)
Tue, 10 May, 15:00 - 15:45 UTC

Session Chair:
Shinji Watanabe, CMU
Session SPE-46
SPE-46.1: Endpoint Detection for Streaming End-to-End Multi-talker ASR
Liang Lu, Otter.ai, United States of America; Jinyu Li, Yifan Gong, Microsoft, United States of America
SPE-46.2: CONTINUOUS STREAMING MULTI-TALKER ASR WITH DUAL-PATH TRANSDUCERS
Desh Raj, Johns Hopkins University, United States of America; Liang Lu, Zhuo Chen, Yashesh Gaur, Jinyu Li, Microsoft Corp., United States of America
SPE-46.3: EXTENDED GRAPH TEMPORAL CLASSIFICATION FOR MULTI-SPEAKER END-TO-END ASR
Xuankai Chang, Shinji Watanabe, Carnegie Mellon University, United States of America; Niko Moritz, Facebook AI, United Kingdom of Great Britain and Northern Ireland; Takaaki Hori, Jonathan Le Roux, Mitsubishi Electric Research Laboratories (MERL), United States of America
SPE-46.4: ADA-VAD: UNPAIRED ADVERSARIAL DOMAIN ADAPTATION FOR NOISE-ROBUST VOICE ACTIVITY DETECTION
Taesoo Kim, Jong Hwan Ko, Sungkyunkwan University, Republic of Korea; Jiho Chang, Korea Research Institute of Standards and Science, Republic of Korea
SPE-46.5: MULTI-CHANNEL END-TO-END NEURAL DIARIZATION WITH DISTRIBUTED MICROPHONES
Shota Horiguchi, Yuki Takashima, Yohei Kawaguchi, Hitachi, Ltd., Japan; Paola Garcia, Johns Hopkins University, United States of America; Shinji Watanabe, Carnegie Mellon University, United States of America
SPE-46.6: MULTI-CHANNEL SPEAKER DIARIZATION USING SPATIAL FEATURES FOR MEETINGS
Naijun Zheng, XunYing Liu, Helen Meng, The Chinese University of Hong Kong, Hong Kong; Na Li, Jianwei Yu, Chao Weng, Dan Su, Tencent AI Lab, Hong Kong