SPE-72.4
ASR-AWARE END-TO-END NEURAL DIARIZATION
Aparna Khare, Eunjung Han, Yuguang Yang, Andreas Stolcke, Amazon, United States of America
Session:
Speaker Diarization I
Track:
Speech and Language Processing
Location:
Gather Area B
Presentation Time:
Thu, 12 May, 22:00 - 22:45 China Time (UTC +8)
Thu, 12 May, 14:00 - 14:45 UTC
Session Chair:
Ming Li, Duke Univ.
Session SPE-72
SPE-72.1: TURN-TO-DIARIZE: ONLINE SPEAKER DIARIZATION CONSTRAINED BY TRANSFORMER TRANSDUCER SPEAKER TURN DETECTION
Wei Xia, University of Texas at Dallas, United States of America; Han Lu, Quan Wang, Anshuman Tripathi, Yiling Huang, Ignacio Lopez Moreno, Hasim Sak, Google, United States of America
SPE-72.2: TRANSCRIBE-TO-DIARIZE: NEURAL SPEAKER DIARIZATION FOR UNLIMITED NUMBER OF SPEAKERS USING END-TO-END SPEAKER-ATTRIBUTED ASR
Naoyuki Kanda, Xiong Xiao, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka, Microsoft, United States of America
SPE-72.3: A MULTITASK LEARNING FRAMEWORK FOR SPEAKER CHANGE DETECTION WITH CONTENT INFORMATION FROM UNSUPERVISED SPEECH DECOMPOSITION
Hang Su, Danyang Zhao, Long Dang, Xixin Wu, Xunying Liu, Helen Meng, The Chinese University of Hong Kong, Hong Kong; Minglei Li, Huawei Cloud, China
SPE-72.4: ASR-AWARE END-TO-END NEURAL DIARIZATION
Aparna Khare, Eunjung Han, Yuguang Yang, Andreas Stolcke, Amazon, United States of America
SPE-72.5: Reformulating Speaker Diarization as Community Detection With Emphasis On Topological Structure
Siqi Zheng, Hongbin Suo, Alibaba Group, China
SPE-72.6: TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context
Nithin Rao Koluguri, Taejin Park, Boris Ginsburg, NVIDIA, United States of America