IEEE ICASSP 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing

7-13 May 2022

Virtual (all paper presentations)

22-27 May 2022

Main Venue: Marina Bay Sands Expo & Convention Center, Singapore

27-28 October 2022

Satellite Venue: Crowne Plaza Shenzhen Longgang City Centre, Shenzhen, China

SPE-35.1

ASSEM-VC: REALISTIC VOICE CONVERSION BY ASSEMBLING MODERN SPEECH SYNTHESIS TECHNIQUES

Kang-wook Kim, Junhyeok Lee, Myun-chul Joe, MINDsLab Inc., Korea, Republic of; Seung-won Park, Seoul National University, Korea, Republic of

Session:

Voice Conversion II

Location:

Gather Area D

Presentation Time:

Tue, 10 May, 20:00 - 20:45 China Time (UTC +8)
Tue, 10 May, 12:00 - 12:45 UTC

Session Chair:

Qiong Hu, Google

Session SPE-35

SPE-35.1: ASSEM-VC: REALISTIC VOICE CONVERSION BY ASSEMBLING MODERN SPEECH SYNTHESIS TECHNIQUES

Kang-wook Kim, Junhyeok Lee, Myun-chul Joe, MINDsLab Inc., Korea, Republic of; Seung-won Park, Seoul National University, Korea, Republic of

SPE-35.2: MINIMIZING RESIDUALS FOR NATIVE-NONNATIVE VOICE CONVERSION IN A SPARSE, ANCHOR-BASED REPRESENTATION OF SPEECH

Christopher Liberatore, Ricardo Gutierrez-Osuna, Texas A&M University, United States of America

SPE-35.3: IMPROVING RECOGNITION-SYNTHESIS BASED ANY-TO-ONE VOICE CONVERSION WITH CYCLIC TRAINING

Yannian Chen, Zhenhua Ling, National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, P.R.China, China; Lijuan Liu, Yajun Hu, Yuan Jiang, iFLYTEK Research, iFLYTEK Co., Ltd., Hefei, P.R.China, China

SPE-35.4: NVC-NET: END-TO-END ADVERSARIAL VOICE CONVERSION

Bac Nguyen, Fabien Cardinaux, Sony Europe B.V., Germany

SPE-35.5: U-GAT-VC: UNSUPERVISED GENERATIVE ATTENTIONAL NETWORKS FOR NON-PARALLEL VOICE CONVERSION

Sheng Shi, Yangzhou Du, Jianping Fan, Lenovo Research, China; Jiahao Shao, Tsinghua University, China; Yifei Hao, University of Southern California, China

SPE-35.6: DISENTANGLING CONTENT AND FINE-GRAINED PROSODY INFORMATION VIA HYBRID ASR BOTTLENECK FEATURES FOR VOICE CONVERSION

Xintao Zhao, Changhe Song, Zhiyong Wu, Tsinghua University, China; Feng Liu, Shiyin Kang, Deyi Tuo, Huya Inc, China; Helen Meng, The Chinese University of Hong Kong, China

Last updated 21 May 2022.
