2021 IEEE International Conference on Acoustics, Speech and Signal Processing

6-11 June 2021 • Toronto, Ontario, Canada

Extracting Knowledge from Information

IEEE Signal Processing Society

Institute of Electrical and Electronics Engineers (IEEE)

Paper Detail

Paper ID	IVMSP-7.3
Paper Title	ADAPTABLE ENSEMBLE DISTILLATION
Authors	Yankai Wang, Dawei Yang, Wei Zhang, Fudan University, China; Zhe Jiang, ARM Ltd., United Kingdom; Wenqiang Zhang, Fudan University, China
Session	IVMSP-7: Machine Learning for Image Processing I
Location	Gather.Town
Session Time	Wednesday, 09 June, 13:00 - 13:45
Presentation Time	Wednesday, 09 June, 13:00 - 13:45
Presentation	Poster
Topic	Image, Video, and Multidimensional Signal Processing: [IVCOM] Image & Video Communications
Abstract	Online knowledge distillation (OKD), which simultaneously trains several peer networks to construct a powerful teacher on the fly, has drawn much attention in recent years. OKD is designed to simplify the training procedure of conventional offline distillation. However, the ensemble strategy of existing OKD methods is inflexible and relies heavily on random initializations. In this paper, we propose Adaptable Ensemble Distillation (AED), which inherits the merits of existing OKD methods while overcoming their major drawbacks. The novelty of AED lies in three aspects: (1) an individual-regulated mechanism is proposed to flexibly regulate each individual model and thereby generate an online ensemble with strong adaptability; (2) a diversity-aroused loss is designed to explicitly diversify the individual models, which enhances the robustness of the ensemble; (3) an empirical distillation technique is adopted to directly promote knowledge transfer in the OKD framework. Extensive experiments show that the proposed AED consistently outperforms existing state-of-the-art OKD methods on various datasets.
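
The generic OKD setup that the abstract builds on (several peers, an ensemble teacher formed on the fly, and a distillation term pulling each peer toward that teacher) might be sketched as follows. This is a minimal illustration of a common OKD baseline, not the AED method: the abstract does not specify the individual-regulated mechanism, the diversity-aroused loss, or the empirical distillation technique, so none of them is reproduced here. The function names, the `weights` argument (a hypothetical stand-in for any per-peer weighting), the averaged-logit ensemble, and the temperature of 3.0 are all assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_logits(peer_logits, weights=None):
    """Weighted average of peer logits (uniform weights by default).

    `weights` is a hypothetical hook; the abstract does not say how
    AED weights or regulates its peers.
    """
    n_peers = len(peer_logits)
    if weights is None:
        weights = [1.0 / n_peers] * n_peers
    n_classes = len(peer_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, peer_logits))
        for c in range(n_classes)
    ]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_losses(peer_logits, temperature=3.0, weights=None):
    """Per-peer KL loss toward the softened ensemble (teacher) output."""
    teacher = softmax(ensemble_logits(peer_logits, weights), temperature)
    return [
        temperature ** 2 * kl_divergence(teacher, softmax(logits, temperature))
        for logits in peer_logits
    ]


# Two hypothetical peers scoring one sample over three classes.
example_peers = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
example_losses = distillation_losses(example_peers)
```

In a training loop, each peer would typically add its distillation term to its own cross-entropy loss; when all peers produce identical logits, the distillation terms vanish because the ensemble equals every peer.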