Login Paper Search My Schedule Paper Index Help

My ICIP 2021 Schedule

Note: Your custom schedule will not be saved unless you create a new account or login to an existing account.
  1. Create a login based on your email (takes less than one minute)
  2. Perform 'Paper Search'
  3. Select papers that you desire to save in your personalized schedule
  4. Click on 'My Schedule' to see the current list of selected papers
  5. Click on 'Printable Version' to create a separate window suitable for printing (the header and menu will appear, but will not actually print)

Paper Detail

Paper IDMLR-APPL-IVASR-1.7
Paper Title JOINT LEARNING ON THE HIERARCHY REPRESENTATION FOR FINE-GRAINED HUMAN ACTION RECOGNITION
Authors Mei Chee Leong, Hui Li Tan, Institute for Infocomm Research (I2R), A*STAR, Singapore; Haosong Zhang, Nanyang Technological University, Singapore; Liyuan Li, Institute for Infocomm Research (I2R), A*STAR, Singapore; Feng Lin, Nanyang Technological University, Singapore; Joo-Hwee Lim, Institute for Infocomm Research (I2R), A*STAR, Singapore
SessionMLR-APPL-IVASR-1: Machine learning for image and video analysis, synthesis, and retrieval 1
LocationArea D
Session Time:Monday, 20 September, 13:30 - 15:00
Presentation Time:Monday, 20 September, 13:30 - 15:00
Presentation Poster
Topic Applications of Machine Learning: Machine learning for image & video analysis, synthesis, and retrieval
IEEE Xplore Open Preview  Click here to view in IEEE Xplore
Abstract Fine-grained human action recognition is a core research topic in computer vision. Inspired by the recently proposed hierarchy representation of fine-grained actions in FineGym and SlowFast network for action recognition, we propose a novel multi-task network which exploits the FineGym hierarchy representation to achieve effective joint learning and prediction for fine-grained human action recognition. The multi-task network consists of three pathways of SlowOnly networks with gradually increased frame rates for events, sets and elements of fine-grained actions, followed by our proposed integration layers for joint learning and prediction. It is a two-stage approach, where it first learns deep feature representation at each hierarchical level, and is followed by feature encoding and fusion for multi-task learning. Our empirical results on the FineGym dataset achieve a new state-of-the-art performance, with 91.80% Top-1 accuracy and 88.46% mean accuracy for element actions, which are 3.40% and 7.26% higher than the previous best results.