Abstract
With labeled data, self distillation (SD) has been proposed to develop compact but effective models without a complex teacher model available in advan......