SPE-82.3
Towards End-to-End Speaker Diarization with Generalized Neural Speaker Clustering
Chunlei Zhang, Chao Weng, Meng Yu, Dong Yu, Tencent AI Lab, United States of America; Jiatong Shi, Carnegie Mellon University, United States of America
Session:
Speaker Diarization II
Track:
Speech and Language Processing
Location:
Gather Area B
Presentation Time:
Fri, 13 May, 21:00 - 21:45 China Time (UTC +8)
Fri, 13 May, 13:00 - 13:45 UTC
Session Chair:
Paola Garcia, JHU
Session SPE-82
SPE-82.1: INCORPORATING END-TO-END FRAMEWORK INTO TARGET-SPEAKER VOICE ACTIVITY DETECTION
Weiqing Wang, Duke University, United States of America; Ming Li, Duke Kunshan University, China
SPE-82.2: MULTI-SCALE SPEAKER EMBEDDING-BASED GRAPH ATTENTION NETWORKS FOR SPEAKER DIARISATION
Youngki Kwon, Hee-Soo Heo, Jee-weon Jung, You Jin Kim, Bong-Jin Lee, Naver Corporation, Korea, Republic of; Joon Son Chung, Korea Advanced Institute of Science and Technology, Korea, Republic of
SPE-82.3: Towards End-to-End Speaker Diarization with Generalized Neural Speaker Clustering
Chunlei Zhang, Chao Weng, Meng Yu, Dong Yu, Tencent AI Lab, United States of America; Jiatong Shi, Carnegie Mellon University, United States of America
SPE-82.4: AUXILIARY LOSS OF TRANSFORMER WITH RESIDUAL CONNECTION FOR END-TO-END SPEAKER DIARIZATION
Yechan Yu, Dongkeon Park, Hong Kook Kim, Gwangju Institute of Science and Technology, Korea, Republic of
SPE-82.5: Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model
Keisuke Kinoshita, Marc Delcroix, Tomoharu Iwata, NTT, Japan
SPE-82.6: IMPROVING SEPARATION-BASED SPEAKER DIARIZATION VIA ITERATIVE MODEL REFINEMENT AND SPEAKER EMBEDDING BASED POST-PROCESSING
Shu-Tong Niu, Jun Du, University of Science and Technology of China, China; Lei Sun, iFlytek Research, China; Chin-Hui Lee, Georgia Institute of Technology, United States of America