Presentation # | 6 |
Session: | Speaker Recognition/Verification |
Session Time: | Thursday, December 20, 10:00 - 12:00 |
Presentation Time: | Thursday, December 20, 10:00 - 12:00 |
Presentation: | Poster |
Topic: | Speaker/language recognition: |
Paper Title: | ROLE ANNOTATED SPEECH RECOGNITION FOR CONVERSATIONAL INTERACTIONS |
Authors: |
Nikolaos Flemotomos; University of Southern California |
Zhuohao Chen; University of Southern California |
David Atkins; University of Washington |
Shrikanth Narayanan; University of Southern California |
Abstract: |
Speaker Role Recognition (SRR) assigns a specific speaker role to each speaker-homogeneous speech segment in a conversation. Typically, those segments must first be identified through a diarization step. Additionally, since SRR is usually based on the different linguistic patterns observed between the roles to be recognized, an Automatic Speech Recognition (ASR) system is also indispensable for converting speech to text. In this work, we introduce a Role Annotated Speech Recognition (RASR) system which, given a speech signal, outputs a sequence of words annotated with the corresponding speaker roles. This eliminates the need for separate component modules, whose chaining may lead to error propagation. We present, analyze, and test our system for the case of two speaker roles to showcase an end-to-end approach for automatic rich transcription with application to clinical dyadic interactions. |