Session Time: Friday, December 21, 10:00 - 12:00
Presentation Time: Friday, December 21, 10:00 - 12:00
Anna Björk Nikulásdóttir, Jón Guðnason, Reykjavik University, Iceland; Eiríkur Rögnvaldsson, University of Iceland, Iceland
Abstract: This paper describes an Icelandic pronunciation dictionary for speech applications and its processing for use in a text-to-speech system for Icelandic. Cleaning and correction procedures were implemented to create a consistent training set for grapheme-to-phoneme (g2p) conversion modeling, which is needed for the automatic extension of the dictionary. Experiments using the original version of the dictionary and the cleaned version described in this paper as training sets for a joint-sequence g2p algorithm show a clear benefit of training on clean data, both in terms of phoneme error rate (PER) and in terms of the categories of errors made by the g2p algorithm. The results of the dictionary processing were also used to create an initial version of an open source database for Icelandic speech applications.
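
For readers unfamiliar with the metric, the following is a minimal sketch of how a phoneme error rate is commonly computed: the total Levenshtein edit distance between reference and predicted phoneme sequences, divided by the total number of reference phonemes. This is an illustration under that assumption, not the authors' evaluation code. The function names and the placeholder phoneme symbols are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(pairs: Iterable[Tuple[Sequence[str], Sequence[str]]]) -> float:
    """PER over (reference, hypothesis) pairs: total edits / total reference phonemes."""
    pairs = list(pairs)
    total = sum(len(ref) for ref, _ in pairs)
    if total == 0:
        raise ValueError("no reference phonemes to score against")
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    return errors / total


# Placeholder symbols only; these are not real dictionary transcriptions.
example = [
    (["a", "b", "c", "d"], ["a", "b", "d"]),  # one deletion
    (["e", "f"], ["e", "f"]),                 # exact match
]
per = phoneme_error_rate(example)  # 1 edit over 6 reference phonemes
```

Normalizing by the total reference length, rather than averaging per-word rates, keeps long and short dictionary entries weighted by their phoneme count; the paper itself may define the aggregation differently.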