Abstract
The self-attention mechanism has been a key factor in the recent progress of the Vision Transformer (ViT), which enables adaptive feature extraction from glob......