2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

Technical Program

Paper Detail

Paper IDIVMSP-24.3
Paper Title ABSOLUTE 3D POSE ESTIMATION AND LENGTH MEASUREMENT OF SEVERELY DEFORMED FISH FROM MONOCULAR VIDEOS IN LONGLINE FISHING
Authors Jie Mei, Jenq-Neng Hwang, University of Washington, United States; Suzanne Romain, Craig Rose, Braden Moore, Kelsey Magrane, Pacific States Marine Fisheries Commission, National Oceanic and Atmospheric Administration, United States
SessionIVMSP-24: Applications 2
LocationGather.Town
Session Time:Thursday, 10 June, 15:30 - 16:15
Presentation Time:Thursday, 10 June, 15:30 - 16:15
Presentation Poster
Topic Image, Video, and Multidimensional Signal Processing: [IVARS] Image & Video Analysis, Synthesis, and Retrieval
IEEE Xplore Open Preview  Click here to view in IEEE Xplore
Virtual Presentation  Click here to watch in the Virtual Conference
Abstract Monocular absolute 3D fish pose estimation allows for efficient fish length measurement in the longline fisheries, where fishes are under severe deformation during the catching process. This task is challenging since it requires locating absolute 3D fish keypoints based on a short monocular video clip. Unlike related works, which either require expensive 3D ground-truth data and/or multiple-view images to provide depth information, or are limited to rigid objects, we propose a novel frame-based method to estimate the absolute 3D fish pose and fish length from a single-view 2D segmentation mask. We first introduce a relative 3D fish template. By minimizing an objective function, our method systematically estimates the relative 3D pose of the target fish and fish 2D keypoints in the image. Finally, with a closed-form solution, the relative 3D fish pose can help locate absolute 3D keypoints, resulting in the frame-based absolute fish length measurement, which is further refined based on the statistical temporal inference for the optimal fish length measurement from the video clip. Our experiments show that this method can accurately estimate the absolute 3D fish pose and further measure the absolute length, even outperforming the state-of-the-art multi-view method.