Paper ID | MLR-APPL-IVSMR-1.5
Paper Title | MULTI-TASK DISTILLATION: TOWARDS MITIGATING THE NEGATIVE TRANSFER IN MULTI-TASK LEARNING
Authors | Ze Meng, Xin Yao, Lifeng Sun, Tsinghua University, China
Session | MLR-APPL-IVSMR-1: Machine learning for image and video sensing, modeling and representation 1
Location | Area C
Session Time | Tuesday, 21 September, 13:30 - 15:00
Presentation Time | Tuesday, 21 September, 13:30 - 15:00
Presentation | Poster
Topic | Applications of Machine Learning: Machine learning for image & video sensing, modeling, and representation
Abstract | In this paper, we propose a top-down mechanism for alleviating negative transfer in multi-task learning (MTL). MTL aims to learn general meta-knowledge by sharing inductive biases among tasks, thereby improving generalization. However, MTL suffers from negative transfer: improving the performance of one task degrades the performance of other tasks because the tasks compete with each other. As a multi-objective optimization problem, MTL usually involves a trade-off among the individual performances of the tasks. Inspired by knowledge distillation, which transfers knowledge from a teacher model to a student model without significant performance loss, we propose multi-task distillation to cope with negative transfer, turning the multi-objective problem into a multi-teacher knowledge distillation problem. Specifically, we first collect task-specific Pareto-optimal teacher models and then use multi-teacher knowledge distillation to achieve high individual performance on every task in a single student model, without a trade-off. Moreover, we propose multi-task warm-up initialization and a teacher experience pool to accelerate our method. Extensive experiments on various benchmark datasets demonstrate that our method outperforms state-of-the-art multi-task learning algorithms and the single-task training baseline.
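The abstract describes a two-stage recipe: first collect task-specific, Pareto-optimal teachers, then distill all of them into a single multi-task student. As a rough, minimal sketch of how such multi-teacher distillation could be wired up, the PyTorch snippet below has each frozen task-specific teacher supervise the corresponding head of a shared-backbone student through a temperature-scaled KL loss. All names here (`SharedEncoder`, `TaskHead`, `kd_loss`, `distill_step`) and the toy dimensions are illustrative assumptions, not the authors' implementation; the sketch also omits the paper's multi-task warm-up initialization and teacher experience pool, as well as any ground-truth task losses that would normally accompany the distillation term.

```python
# Illustrative sketch of multi-teacher distillation for MTL.
# All module and function names are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedEncoder(nn.Module):
    """Shared backbone of the multi-task student."""
    def __init__(self, in_dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())

    def forward(self, x):
        return self.net(x)


class TaskHead(nn.Module):
    """Per-task output head (classification, for this toy example)."""
    def __init__(self, hidden=256, num_classes=10):
        super().__init__()
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, z):
        return self.fc(z)


def kd_loss(student_logits, teacher_logits, T=2.0):
    """Soft-label distillation loss: KL divergence at temperature T."""
    p_t = F.softmax(teacher_logits / T, dim=-1)
    log_p_s = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * (T * T)


def distill_step(x, teachers, encoder, heads, optimizer, alpha=1.0):
    """One multi-teacher distillation step: each frozen task-specific
    teacher supervises the matching head of the shared student."""
    z = encoder(x)
    loss = 0.0
    for task, teacher in teachers.items():
        with torch.no_grad():
            t_logits = teacher(x)      # frozen (e.g., Pareto-optimal) teacher
        s_logits = heads[task](z)      # student prediction for this task
        loss = loss + alpha * kd_loss(s_logits, t_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    # Toy usage with stand-in teachers (real teachers would be trained models).
    tasks = ["task_a", "task_b"]
    encoder = SharedEncoder()
    heads = nn.ModuleDict({t: TaskHead() for t in tasks})
    teachers = {t: nn.Linear(128, 10).eval() for t in tasks}
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(heads.parameters()), lr=1e-3
    )
    print(distill_step(torch.randn(32, 128), teachers, encoder, heads, opt))
```

Note that in this sketch the shared encoder still receives gradients from every task head; the frozen teachers' soft predictions simply replace the ground-truth labels as per-task supervision signals.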