SPE-55.1
ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech
Yi Ren, Zhou Zhao, Zhejiang University, China; Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Alibaba Group, China
Session: Speech Synthesis: Prosody
Track: Speech and Language Processing
Location: Gather Area D
Presentation Time: Wed, 11 May, 21:00 - 21:45 China Time (UTC +8); Wed, 11 May, 13:00 - 13:45 UTC
Session Chair: Zhiyong Wu, Tsinghua University
Session SPE-55
SPE-55.1: ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech
Yi Ren, Zhou Zhao, Zhejiang University, China; Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Alibaba Group, China
SPE-55.2: PROSODYSPEECH: TOWARDS ADVANCED PROSODY MODEL FOR NEURAL TEXT-TO-SPEECH
Yuanhao Yi, Lei He, Shifeng Pan, Xi Wang, Yujia Xiao, Microsoft, China
SPE-55.3: HIERARCHICAL PROSODY MODELING AND CONTROL IN NON-AUTOREGRESSIVE PARALLEL NEURAL TTS
Tuomo Raitio, Jiangchuan Li, Shreyas Seshadri, Apple, United States of America
SPE-55.4: DISCOURSE-LEVEL PROSODY MODELING WITH A VARIATIONAL AUTOENCODER FOR NON-AUTOREGRESSIVE EXPRESSIVE SPEECH SYNTHESIS
Ning-Qian Wu, Zhao-Ci Liu, Zhen-Hua Ling, University of Science and Technology of China, China
SPE-55.5: UNSUPERVISED WORD-LEVEL PROSODY TAGGING FOR CONTROLLABLE SPEECH SYNTHESIS
Yiwei Guo, Chenpeng Du, Kai Yu, Shanghai Jiao Tong University, China
SPE-55.6: A CHARACTER-LEVEL SPAN-BASED MODEL FOR MANDARIN PROSODIC STRUCTURE PREDICTION
Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Tsinghua University, China; Changbin Chen, Zhongqin Wu, TAL Education Group, China; Helen Meng, The Chinese University of Hong Kong, China