SPE-35.1
ASSEM-VC: REALISTIC VOICE CONVERSION BY ASSEMBLING MODERN SPEECH SYNTHESIS TECHNIQUES
Kang-wook Kim, Junhyeok Lee, Myun-chul Joe, MINDsLab Inc., Korea, Republic of; Seung-won Park, Seoul National University, Korea, Republic of
Session:
Voice Conversion II
Track:
Speech and Language Processing
Location:
Gather Area D
Presentation Time:
Tue, 10 May, 20:00 - 20:45 China Time (UTC +8)
Tue, 10 May, 12:00 - 12:45 UTC
Session Chair:
Qiong Hu, Google
Session SPE-35
SPE-35.1: ASSEM-VC: REALISTIC VOICE CONVERSION BY ASSEMBLING MODERN SPEECH SYNTHESIS TECHNIQUES
Kang-wook Kim, Junhyeok Lee, Myun-chul Joe, MINDsLab Inc., Korea, Republic of; Seung-won Park, Seoul National University, Korea, Republic of
SPE-35.2: MINIMIZING RESIDUALS FOR NATIVE-NONNATIVE VOICE CONVERSION IN A SPARSE, ANCHOR-BASED REPRESENTATION OF SPEECH
Christopher Liberatore, Ricardo Gutierrez-Osuna, Texas A&M University, United States of America
SPE-35.3: IMPROVING RECOGNITION-SYNTHESIS BASED ANY-TO-ONE VOICE CONVERSION WITH CYCLIC TRAINING
Yannian Chen, Zhenhua Ling, National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, P.R.China, China; Lijuan Liu, Yajun Hu, Yuan Jiang, iFLYTEK Research, iFLYTEK Co., Ltd., Hefei, P.R.China, China
SPE-35.4: NVC-NET: END-TO-END ADVERSARIAL VOICE CONVERSION
Bac Nguyen, Fabien Cardinaux, Sony Europe B.V., Germany
SPE-35.5: U-GAT-VC: Unsupervised Generative Attentional Networks for Non-parallel Voice Conversion
Sheng Shi, Yangzhou Du, Jianping Fan, Lenovo Research, China; Jiahao Shao, Tsinghua University, China; Yifei Hao, University of Southern California, China
SPE-35.6: DISENTANGLING CONTENT AND FINE-GRAINED PROSODY INFORMATION VIA HYBRID ASR BOTTLENECK FEATURES FOR VOICE CONVERSION
Xintao Zhao, Changhe Song, Zhiyong Wu, Tsinghua University, China; Feng Liu, Shiyin Kang, Deyi Tuo, Huya Inc, China; Helen Meng, The Chinese University of Hong Kong, China