SPE-66.2
VOICE FILTER: FEW-SHOT TEXT-TO-SPEECH SPEAKER ADAPTATION USING VOICE CONVERSION AS A POST-PROCESSING MODULE
Adam Gabryś, Goeric Huybrechts, Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba, Amazon, Poland; Chung-Ming Chien, National Taiwan University (NTU), Taiwan
Session:
Speech Synthesis: Style control, Transfer, and Adaptation
Track:
Speech and Language Processing
Location:
Gather Area D
Presentation Time:
Thu, 12 May, 20:00 - 20:45 China Time (UTC +8)
Thu, 12 May, 12:00 - 12:45 UTC
Session Chair:
Jiangyan Yi, Institute of Automation, CAS
Session SPE-66
SPE-66.1: SPEAKER GENERATION
Daisy Stanton, Matt Shannon, Soroosh Mariooryad, RJ Skerry-Ryan, Eric Battenberg, Tom Bagby, David Kao, Google, United States of America
SPE-66.2: VOICE FILTER: FEW-SHOT TEXT-TO-SPEECH SPEAKER ADAPTATION USING VOICE CONVERSION AS A POST-PROCESSING MODULE
Adam Gabryś, Goeric Huybrechts, Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba, Amazon, Poland; Chung-Ming Chien, National Taiwan University (NTU), Taiwan
SPE-66.3: FINE-GRAINED STYLE CONTROL IN TRANSFORMER-BASED TEXT-TO-SPEECH SYNTHESIS
Li-Wei Chen, Alexander Rudnicky, Carnegie Mellon University, United States of America
SPE-66.4: USING MULTIPLE REFERENCE AUDIOS AND STYLE EMBEDDING CONSTRAINTS FOR SPEECH SYNTHESIS
Cheng Gong, Longbiao Wang, Tianjin University, China; Zhenhua Ling, University of Science and Technology of China, China; Ju Zhang, Huiyan Technology (Tianjin) Co., Ltd, China; Jianwu Dang, Japan Advanced Institute of Science and Technology, Japan
SPE-66.5: ENHANCING SPEAKING STYLES IN CONVERSATIONAL TEXT-TO-SPEECH SYNTHESIS WITH GRAPH-BASED MULTI-MODAL CONTEXT MODELING
Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Tsinghua University, China; Helen Meng, The Chinese University of Hong Kong, China; Chao Weng, Dan Su, Tencent, China
SPE-66.6: TOWARDS EXPRESSIVE SPEAKING STYLE MODELLING WITH HIERARCHICAL CONTEXT INFORMATION FOR MANDARIN SPEECH SYNTHESIS
Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Tsinghua University, China; Shiyin Kang, Huya Inc., China; Helen Meng, The Chinese University of Hong Kong, Hong Kong