2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Paper Detail

Paper ID: SPE-57.3
Paper Title: AUTOMATIC ELICITATION COMPLIANCE FOR SHORT-DURATION SPEECH BASED DEPRESSION DETECTION
Authors: Brian Stasak, Zhaocheng Huang (University of New South Wales, Australia); Dale Joachim (Sonde Health, United States); Julien Epps (University of New South Wales, Australia)
Session: SPE-57: Speech, Depression and Sleepiness
Location: Gather.Town
Session Time: Friday, 11 June, 14:00 - 14:45
Presentation Time: Friday, 11 June, 14:00 - 14:45
Presentation: Poster
Topic: Speech Processing: [SPE-ANLS] Speech Analysis
Abstract: Detecting depression from the voice in naturalistic environments is challenging, particularly for short-duration audio recordings. This heightens the need to interpret and make optimal use of elicited speech. The rapid consonant-vowel syllable combination ‘pataka’ has frequently been selected as a clinical motor-speech task. However, elicited recordings vary considerably, and this variability remains to be investigated. In this multi-corpus study of over 25,000 ‘pataka’ utterances, speech landmark-based features were found to be sensitive to the number of ‘pataka’ utterances per recording. This landmark feature sensitivity was newly exploited to automatically estimate ‘pataka’ count and rate, achieving root mean square errors nearly three times lower than chance level. When this count and rate knowledge of the elicited speech was applied to depression detection, the estimated ‘pataka’ count and rate proved important for normalizing evaluative ‘pataka’ speech data. Count- and/or rate-normalized ‘pataka’ models produced relative reductions in depression classification error of up to 26% compared with non-normalized models.
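
The two figures of merit quoted in the abstract, root mean square error against a chance-level baseline and relative reduction in classification error, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the counts and error rates are invented placeholders rather than data from the paper, and the chance baseline is taken to be "always predict the mean count", which may differ from the definition the authors used.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def relative_error_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


def utterance_rate(count, duration_s):
    """'pataka' utterances per second for one recording."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return count / duration_s


# Hypothetical 'pataka' counts per recording (NOT data from the paper).
actual_counts = [4, 6, 8, 10]
estimated_counts = [5, 6, 7, 10]

# Assumed chance baseline: predict the mean count for every recording.
mean_count = sum(actual_counts) / len(actual_counts)
chance_counts = [mean_count] * len(actual_counts)

estimator_rmse = rmse(estimated_counts, actual_counts)
chance_rmse = rmse(chance_counts, actual_counts)
rmse_ratio = chance_rmse / estimator_rmse

# Hypothetical classification error rates before and after normalization.
reduction = relative_error_reduction(baseline_error=0.50, new_error=0.37)

print(f"estimator RMSE: {estimator_rmse:.3f}")
print(f"chance RMSE:    {chance_rmse:.3f}")
print(f"chance / estimator ratio: {rmse_ratio:.2f}")
print(f"relative error reduction: {reduction:.0%}")
```

A ratio near three from `chance_rmse / estimator_rmse` corresponds to the "nearly three times lower than chance level" wording. The relative reduction is a proportion of the baseline error, not an absolute difference in percentage points.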