Presentation # | 8 |
Session: | ASR IV |
Location: | Kallirhoe Hall |
Session Time: | Friday, December 21, 13:30 - 15:30 |
Presentation Time: | Friday, December 21, 13:30 - 15:30 |
Presentation: | Poster |
Topic: | Speech recognition and synthesis: |
Paper Title: | A K-NEAREST NEIGHBOURS APPROACH TO UNSUPERVISED SPOKEN TERM DISCOVERY |
Authors: | Alexis Thual, Corentin Dancette, Julien Karadayi, Juan Benjumea, Emmanuel Dupoux, ENS, France |
Abstract: | Unsupervised spoken term discovery is the task of finding recurrent acoustic patterns in speech without any annotations. Current approaches consist of two steps: (1) discovering similar patterns in speech, and (2) partitioning those pairs of acoustic tokens using graph clustering methods. We propose a new approach for the first step. Previous systems used various approximation algorithms to make the search tractable on large amounts of data. Our approach is based on an optimized k-nearest neighbours (KNN) search coupled with a fixed word embedding algorithm. The results show that the KNN algorithm is robust across languages, consistently outperforms the DTW-based baseline, and is competitive with current state-of-the-art spoken term discovery systems. |
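
As a rough illustration of the first step described in the abstract, the sketch below finds, for each speech segment, its k most similar segments. It is a brute-force toy, not the optimized search the authors describe. It assumes each segment has already been mapped to a fixed-length vector by some embedding method, and it assumes cosine similarity as the comparison measure; neither detail is specified in the abstract. The names `cosine` and `knn_pairs` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_pairs(embeddings, k):
    """Return (i, j, similarity) for every segment i and each of its k nearest neighbours j.

    Brute force: compares every segment with every other segment, so the cost
    grows quadratically with the number of segments. Assumption: similarity is
    cosine similarity between fixed-length embeddings.
    """
    pairs = []
    for i, u in enumerate(embeddings):
        # Score all other segments, skipping the trivial self-match.
        scored = [(cosine(u, v), j) for j, v in enumerate(embeddings) if j != i]
        # Most similar first; the sort is stable, so ties keep index order.
        scored.sort(key=lambda s: -s[0])
        pairs.extend((i, j, sim) for sim, j in scored[:k])
    return pairs


# Hypothetical toy data: four segment embeddings in three dimensions.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]
candidate_pairs = knn_pairs(toy_embeddings, k=1)
```

The resulting pairs would then be passed to the second step, graph clustering, which this sketch does not cover.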