Deep Learning for Speech Synthesis
Session Type: Invited talk, Discussion, Oral presentation, Poster session
Time: Tuesday, December 18, 14:00 - 17:00
Location: Kallirhoe Hall

PREDICTING EXPRESSIVE SPEAKING STYLE FROM TEXT IN END-TO-END SPEECH SYNTHESIS
Daisy Stanton; Google
Yuxuan Wang; Google
RJ Ryan; Google

A SPECTRALLY WEIGHTED MIXTURE OF LEAST SQUARE ERROR AND WASSERSTEIN DISCRIMINATOR LOSS FOR GENERATIVE SPSS
Gilles Degottex; ObEN, Inc. - University of Cambridge
Mark Gales; University of Cambridge

SCALING AND BIAS CODES FOR MODELING SPEAKER-ADAPTIVE DNN-BASED SPEECH SYNTHESIS SYSTEMS
Hieu-Thi Luong; National Institute of Informatics
Junichi Yamagishi; National Institute of Informatics

HIERARCHICAL RNNS FOR WAVEFORM-LEVEL SPEECH SYNTHESIS
Qingyun Dou; University of Cambridge
Moquan Wan; University of Cambridge
Gilles Degottex; University of Cambridge
Zhiyi Ma; University of Cambridge
Mark Gales; University of Cambridge

PARAMETER GENERATION ALGORITHMS FOR TEXT-TO-SPEECH SYNTHESIS WITH RECURRENT NEURAL NETWORKS
Viacheslav Klimkov; Amazon
Alexis Moinet; Amazon
Adam Nadolski; Amazon
Thomas Drugman; Amazon

SYNTHETIC-TO-NATURAL SPEECH WAVEFORM CONVERSION USING CYCLE-CONSISTENT ADVERSARIAL NETWORKS
Kou Tanaka; NTT corporation
Takuhiro Kaneko; NTT corporation
Nobukatsu Hojo; NTT corporation
Hirokazu Kameoka; NTT corporation

IMPROVING UNSUPERVISED STYLE TRANSFER IN END-TO-END SPEECH SYNTHESIS WITH END-TO-END SPEECH RECOGNITION
Da-Rong Liu; National Taiwan University
Chi-Yu Yang; National Taiwan University
Szu-Lin Wu; National Taiwan University
Hung-Yi Lee; National Taiwan University

MULTI-SCALE ALIGNMENT AND CONTEXTUAL HISTORY FOR ATTENTION MECHANISM IN SEQUENCE-TO-SEQUENCE MODEL
Andros Tjandra; Nara Institute of Science and Technology
Sakriani Sakti; Nara Institute of Science and Technology
Satoshi Nakamura; Nara Institute of Science and Technology