Paper ID | MMSP-6.2 |
Paper Title |
INDEPENDENT SIGN LANGUAGE RECOGNITION WITH 3D BODY, HANDS, AND FACE RECONSTRUCTION |
Authors |
Agelos Kratimenos, National Technical University of Athens, Greece; Georgios Pavlakos, University of California, Berkeley, United States; Petros Maragos, National Technical University of Athens, Greece |
Session | MMSP-6: Human Centric Multimedia 2 |
Location | Gather.Town |
Session Time: | Thursday, 10 June, 14:00 - 14:45 |
Presentation Time: | Thursday, 10 June, 14:00 - 14:45 |
Presentation | Poster |
Topic |
Multimedia Signal Processing: Signal Processing for Multimedia Applications |
Abstract |
Independent Sign Language Recognition is a complex visual recognition problem that combines several challenging Computer Vision tasks, because it requires exploiting and fusing information from hand gestures, body features and facial expressions. While many state-of-the-art works have studied each of these features in depth independently, to the best of our knowledge, no work has adequately combined all three information channels to efficiently recognize Sign Language. In this work, we employ SMPL-X, a contemporary parametric model that enables joint extraction of 3D body shape, face and hands information from a single image. We use this holistic 3D reconstruction for Sign Language Recognition (SLR), demonstrating that it leads to higher accuracy than recognition from raw RGB images and their optical flow fed into a state-of-the-art I3D-type network for 3D action recognition, and than recognition from 2D OpenPose skeletons fed into a Recurrent Neural Network. Finally, a set of experiments on the body, face and hand features showed that neglecting any of them significantly reduces classification accuracy, demonstrating the importance of jointly modeling body shape, facial expression and hand pose for Sign Language Recognition. |
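The fusion of the three channels described in the abstract, and the ablation of one of them, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `frame_features` and `sequence_features` are hypothetical, the channel sizes are assumptions based on common SMPL-X axis-angle defaults (63 body pose values, 2 x 45 hand pose values, 10 expression coefficients plus 3 jaw pose values), and random numbers stand in for real reconstructions.

```python
import random

# Assumed per-frame parameter sizes for each channel; illustrative only,
# the abstract does not state which SMPL-X parameters are used.
CHANNEL_DIMS = {"body": 63, "hands": 90, "face": 13}


def frame_features(params, channels=("body", "hands", "face")):
    """Concatenate the selected channels of one frame into a flat list."""
    vec = []
    for name in channels:
        vec.extend(float(x) for x in params[name])
    return vec


def sequence_features(frames, channels=("body", "hands", "face")):
    """Build a T x D list of per-frame vectors for a sequence classifier."""
    return [frame_features(f, channels) for f in frames]


# Random stand-in for 16 frames of per-frame parameters (not real data).
random.seed(0)
frames = [
    {name: [random.gauss(0.0, 1.0) for _ in range(dim)]
     for name, dim in CHANNEL_DIMS.items()}
    for _ in range(16)
]

full = sequence_features(frames)                                # all three channels
no_face = sequence_features(frames, channels=("body", "hands"))  # face ablated

print(len(full), len(full[0]), len(no_face[0]))  # 16 166 153
```

Dropping a channel name from `channels` yields the reduced input for an ablation run; either sequence could then be passed to a recurrent classifier of the reader's choice.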