Presentation # | 7 |
Session: | Spoken Language Understanding |
Location: | Kallirhoe Hall |
Session Time: | Wednesday, December 19, 10:00 - 12:00 |
Presentation Time: | Wednesday, December 19, 10:00 - 12:00 |
Presentation: | Poster |
Topic: |
Spoken language understanding: |
Paper Title: |
TOWARD MULTI-FEATURES EMPHASIS SPEECH TRANSLATION: ASSESSMENT OF HUMAN EMPHASIS PRODUCTION AND PERCEPTION WITH SPEECH AND TEXT CLUES |
Authors: |
Quoc Truong Do, Nara Institute of Science and Technology, Japan; Sakriani Sakti, Satoshi Nakamura, Nara Institute of Science and Technology/AIP, Japan |
Abstract: |
Emphasis is an important factor of human speech that helps convey emotion and the essential information of utterances. Recently, studies on speech-to-speech translation have sought to preserve emphasis information from the source language in the target language. However, since different cultures express emphasis in different ways, translating emphasis only between acoustic features may not always reflect users' experiences. Moreover, emphasis can be expressed at various levels in both text and speech. It remains unclear how we communicate emphasis in different forms (acoustic/linguistic) at different levels, and whether we can perceive the difference between emphasis levels or observe the similarity of the same emphasis level across text and speech. In this paper, we analyze human perception of emphasis with both speech and text clues through crowd-sourced evaluations. The results indicate that although participants can distinguish among emphasis levels and perceive the same emphasis level between speech and text, many ambiguities still exist at certain emphasis levels. Thus, our results provide insight into what needs to be handled during the emphasis translation process. |