We present CSWin Transformer, an efficient and effective Transformer-based backbone for general-purpose vision tasks. By further pretraining on the larger dataset ImageNet-21K, we achieve 87.5% Top-1 accuracy on ImageNet-1K and high segmentation performance on ADE20K with 55.7 mIoU.
⚡ This is an original paraphrased summary — not copied from the abstract. Full paper available at the source link below.
This research advances how AI systems learn, reason, and solve problems — with direct implications for automation and scientific discovery.
This summary is based on publicly available metadata and abstract. For the full research paper, visit the original source:
Read Full Paper at OpenAlex| Source | OpenAlex |
| Category | 🤖 Artificial Intelligence |
| Published | Jun 1, 2022 |
| Journal | 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
| DOI | 10.1109/cvpr52688.2022.01181 |
| Citations | 1,228 |
| Authors | Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Weiming Zhang, Nenghai Yu |