Multi-level Logit Distillation

Jin, Y; Wang, JQ; Lin, DH

Wang, JQ (corresponding author), Shanghai AI Lab, Shanghai, Peoples R China.

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 24276

Abstract

Knowledge Distillation (KD) aims to distill knowledge from a large teacher model into a lightweight student model. Mainstream KD methods can be...
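Since the abstract is truncated before the method details, the sketch below shows only the classic logit-based KD loss (Hinton et al., 2015) that this paper builds on, not the multi-level method the paper proposes. The function name, temperature value, and tensor shapes are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def logit_distillation_loss(student_logits: torch.Tensor,
                            teacher_logits: torch.Tensor,
                            temperature: float = 4.0) -> torch.Tensor:
    """Vanilla logit distillation: KL divergence between the
    temperature-softened teacher and student class distributions.
    (Illustrative baseline only; NOT the multi-level method of this paper.)
    """
    # Soften both output distributions with the same temperature.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher,
                    reduction="batchmean") * temperature ** 2

# Example usage with hypothetical logits (batch of 8, 100 classes):
student = torch.randn(8, 100)
teacher = torch.randn(8, 100)
loss = logit_distillation_loss(student, teacher)
```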
