SPE-19.5
SIG-VC: A SPEAKER INFORMATION GUIDED ZERO-SHOT VOICE CONVERSION SYSTEM FOR BOTH HUMAN BEINGS AND MACHINES
Haozhe Zhang, Zexin Cai, Xiaoyi Qin, Ming Li, Duke Kunshan University, China
Session: Voice Conversion: Representation
Track: Speech and Language Processing
Location: Gather Area D
Presentation Time: Mon, 9 May, 20:00 - 20:45 China Time (UTC+8) / Mon, 9 May, 12:00 - 12:45 UTC
Session Chair: Yu Zhang, Google
Session SPE-19
SPE-19.1: DGC-VECTOR: A NEW SPEAKER EMBEDDING FOR ZERO-SHOT VOICE CONVERSION
Ruitong Xiao, South China University of Technology, China; Haitong Zhang, Yue Lin, Netease Games, China
SPE-19.2: S3PRL-VC: OPEN-SOURCE VOICE CONVERSION FRAMEWORK WITH SELF-SUPERVISED SPEECH REPRESENTATIONS
Wen-Chin Huang, Tomoki Hayashi, Tomoki Toda, Nagoya University, Japan; Shu-wen Yang, Hung-yi Lee, National Taiwan University, Taiwan; Shinji Watanabe, Carnegie Mellon University, United States of America
SPE-19.3: TRAINING ROBUST ZERO-SHOT VOICE CONVERSION MODELS WITH SELF-SUPERVISED FEATURES
Trung Dang, Peter Chin, Boston University, United States of America; Dung Tran, Kazuhito Koishida, Microsoft Corp., United States of America
SPE-19.4: A COMPARISON OF DISCRETE AND SOFT SPEECH UNITS FOR IMPROVED VOICE CONVERSION
Benjamin van Niekerk, Matthew Baas, Herman Kamper, Stellenbosch University, South Africa; Marc-André Carbonneau, Julian Zaïdi, Hugo Seuté, Ubisoft, Canada
SPE-19.5: SIG-VC: A SPEAKER INFORMATION GUIDED ZERO-SHOT VOICE CONVERSION SYSTEM FOR BOTH HUMAN BEINGS AND MACHINES
Haozhe Zhang, Zexin Cai, Xiaoyi Qin, Ming Li, Duke Kunshan University, China
SPE-19.6: ROBUST DISENTANGLED VARIATIONAL SPEECH REPRESENTATION LEARNING FOR ZERO-SHOT VOICE CONVERSION
Jiachen Lian, UC Berkeley, United States of America; Chunlei Zhang, Dong Yu, Tencent AI Lab, United States of America