My SLT 2018 Schedule

Paper Detail

Presentation #	5
Session:	Natural Language Processing
Session Time:	Thursday, December 20, 13:30 - 15:30
Presentation Time:	Thursday, December 20, 13:30 - 15:30
Presentation:	Poster
Topic:	Natural language processing
Paper Title:	WORD SEGMENTATION FROM PHONEME SEQUENCES BASED ON PITMAN-YOR SEMI-MARKOV MODEL EXPLOITING SUBWORD INFORMATION
Authors:	Ryu Takeda; Osaka University
	Kazunori Komatani; Osaka University
	Alexander Rudnicky; Carnegie Mellon University
Abstract:	Word segmentation from phoneme sequences is essential for identifying unknown (out-of-vocabulary; OOV) words in spoken dialogues. The Pitman-Yor semi-Markov model (PYSMM) is used for word segmentation that handles a dynamically growing vocabulary. The obtained vocabularies, however, still include meaningless entries due to insufficient cues in the phoneme sequences. We focus here on using subword information to capture patterns as "words." We propose 1) a model based on a subword N-gram and subword estimation using a vocabulary set, and 2) posterior fusion of the results of a PYSMM and our model to take advantage of both. Our experiments showed 1) the potential of using subword information for OOV acquisition, and 2) that our method outperformed the PYSMM by 1.53 and 1.07 in terms of the F-measure of the obtained OOV set for English and Japanese corpora, respectively.
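
To make the segmentation task in the abstract concrete, here is a minimal sketch of semi-Markov (segment-level) Viterbi decoding over a phoneme sequence. It is not the PYSMM and not the authors' method: a PYSMM learns word probabilities nonparametrically, whereas this sketch assumes a fixed, hand-written table of word log-probabilities. The function name `segment`, the parameters `max_len` and `unk_logprob_per_phoneme`, and the toy lexicon are all hypothetical and chosen only for illustration.

```python
import math


def segment(phonemes, word_logprob, max_len=4, unk_logprob_per_phoneme=-5.0):
    """Find the highest-scoring split of a phoneme list into words.

    word_logprob maps a tuple of phonemes to a log-probability (assumed given).
    Spans not in the table get a length-proportional penalty (an assumption
    made here so that unknown spans remain possible but costly).
    """
    n = len(phonemes)
    best = [-math.inf] * (n + 1)  # best[i]: best score of phonemes[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word
    best[0] = 0.0

    for end in range(1, n + 1):
        # Semi-Markov step: consider every candidate word ending at `end`.
        for start in range(max(0, end - max_len), end):
            word = tuple(phonemes[start:end])
            score = word_logprob.get(
                word, unk_logprob_per_phoneme * (end - start)
            )
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = start

    # Follow the back-pointers from the end to recover the word sequence.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(tuple(phonemes[start:end]))
        end = start
    words.reverse()
    return words, best[n]


# Toy lexicon with made-up log-probabilities (illustrative only).
toy_lexicon = {
    ("DH", "AH"): -1.0,
    ("K", "AE", "T"): -2.0,
}
words, score = segment(["DH", "AH", "K", "AE", "T"], toy_lexicon)
print(words, score)
```

With this toy lexicon the decoder prefers the two known entries over any split that contains an unknown span. In the setting the abstract describes, the word probabilities would instead come from the learned model, and spans outside the known vocabulary are the OOV candidates.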