IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022
  • Virtual (all paper presentations)
22-27 May 2022
  • Main Venue: Marina Bay Sands Expo & Convention Center, Singapore
27-28 October 2022
  • Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

SPE-55: Speech Synthesis: Prosody
Wed, 11 May, 21:00 - 21:45 China Time (UTC +8)
Wed, 11 May, 13:00 - 13:45 UTC
Location: Gather Area D
Session Chair: Zhiyong Wu, Tsinghua University
Track: Speech and Language Processing

SPE-55.1: ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech

Yi Ren, Zhou Zhao, Zhejiang University, China; Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Alibaba Group, China

SPE-55.2: ProsodySpeech: Towards Advanced Prosody Model for Neural Text-to-Speech

Yuanhao Yi, Lei He, Shifeng Pan, Xi Wang, Yujia Xiao, Microsoft, China

SPE-55.3: Hierarchical Prosody Modeling and Control in Non-Autoregressive Parallel Neural TTS

Tuomo Raitio, Jiangchuan Li, Shreyas Seshadri, Apple, United States of America

SPE-55.4: Discourse-Level Prosody Modeling With a Variational Autoencoder for Non-Autoregressive Expressive Speech Synthesis

Ning-Qian Wu, Zhao-Ci Liu, Zhen-Hua Ling, University of Science and Technology of China, China

SPE-55.5: Unsupervised Word-Level Prosody Tagging for Controllable Speech Synthesis

Yiwei Guo, Chenpeng Du, Kai Yu, Shanghai Jiao Tong University, China

SPE-55.6: A Character-Level Span-Based Model for Mandarin Prosodic Structure Prediction

Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Tsinghua University, China; Changbin Chen, Zhongqin Wu, TAL Education Group, China; Helen Meng, The Chinese University of Hong Kong, China