Technical Program

Paper Detail

Presentation #12
Session: Voice Conversion and TTS
Location: Kallirhoe Hall
Session Time: Friday, December 21, 10:00 - 12:00
Presentation Time: Friday, December 21, 10:00 - 12:00
Presentation: Poster
Topic: Evaluation methodologies
Paper Title: MOS NATURALNESS AND THE QUEST FOR HUMAN-LIKE SPEECH
Authors: Sajad Shirali-Shahreza, Gerald Penn, University of Toronto, Canada
Abstract: This paper reconsiders the use of MOS naturalness as an instrument for measuring the quality (vs. intelligibility) of speech. We revisit a previously proposed alternative, the paired comparison or "AB" test, and present new empirical evidence that it is indeed a better method for evaluating TTS quality. Using this method, we evaluate three older TTS systems along with a recent deep-learning approach against native North American and Indian speech and show that, in fact, TTS had already crossed the threshold of human-like speech synthesis some time ago. This suggests that a systematic reappraisal of the concept of abstract "naturalness" of speech is in order.