Technical Program

Paper Detail

Presentation #11
Session: Corpora and Evaluation Methodologies
Location: Kallirhoe Hall
Session Time: Wednesday, December 19, 13:30 - 15:30
Presentation Time: Wednesday, December 19, 13:30 - 15:30
Presentation: Poster
Topic: Evaluation methodologies: Healthcare
Paper Title: QUERYING DEPRESSION VLOGS
Authors: Joana Correia, Carnegie Mellon University / INESC-ID, Portugal; Isabel Trancoso, INESC-ID / IST, Portugal; Bhiksha Raj, Carnegie Mellon University, Portugal
Abstract: Speech-based diagnosis-aid tools for depression typically depend on a few small datasets that are expensive to collect. The limited availability of training data limits the quality that these systems can achieve. An unexplored alternative source of large-scale data is vlogs collected from online multimedia repositories. Along with automating the mining process, it is necessary to automate the labeling process as well. In this work, we propose a framework to automatically label a corpus of in-the-wild vlogs of possibly depressed subjects, and we estimate the quality of the predicted labels without ever having access to ground truth for the majority of the corpus. The framework uses a small labeled subset to train a model and estimate the labels for the remainder of the corpus. Then, using the predicted labels, we train a noisy model and attempt to reconstruct the labels of the original labeled subset. We hypothesize that the quality of the estimated labels for the unlabeled subset of the corpus is correlated with the quality of the label reconstruction for the labeled subset. The results of the bi-modal experiment using in-the-wild data are compared with those obtained using controlled data.
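
The label-and-reconstruct loop described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the nearest-centroid classifier, the synthetic two-cluster features standing in for vlog features, and the names `make_blobs`, `fit_centroids`, `predict`, `accuracy`, and `reconstruction_check`. The abstract does not specify the models, features, or quality measure the authors used.

```python
import random

def make_blobs(n_per_class, rng):
    """Hypothetical stand-in for vlog features: two 2-D Gaussian clusters."""
    X, y = [], []
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y

def fit_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(centroids, X):
    """Assign each sample the label of its closest centroid."""
    def nearest(x):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )
    return [nearest(x) for x in X]

def accuracy(y_true, y_pred):
    """Fraction of matching labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def reconstruction_check(X_lab, y_lab, X_unlab):
    """Return predicted labels for the unlabeled set and the
    reconstruction accuracy on the labeled subset."""
    model = fit_centroids(X_lab, y_lab)      # 1. train on the small labeled subset
    y_hat = predict(model, X_unlab)          # 2. estimate labels for the remainder
    noisy = fit_centroids(X_unlab, y_hat)    # 3. train a "noisy" model on predicted labels
    y_rec = predict(noisy, X_lab)            # 4. reconstruct the original labels
    return y_hat, accuracy(y_lab, y_rec)

if __name__ == "__main__":
    rng = random.Random(0)
    X_lab, y_lab = make_blobs(10, rng)       # small labeled subset
    X_unlab, _ = make_blobs(100, rng)        # true labels deliberately discarded
    y_hat, score = reconstruction_check(X_lab, y_lab, X_unlab)
    print(f"reconstruction accuracy on the labeled subset: {score:.2f}")
```

In this sketch the reconstruction accuracy is the only quantity computed against known labels; under the abstract's hypothesis it would serve as a proxy for the unobserved quality of `y_hat`. Whether that correlation holds for real in-the-wild data is what the paper sets out to test, and this toy example does not demonstrate it.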