This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision. The hierarchical design and the shifted window approach also prove beneficial for all-MLP architectures.
This research advances how AI systems learn, reason, and solve problems — with direct implications for automation and scientific discovery.
| Field | Value |
| --- | --- |
| Category | 🤖 Artificial Intelligence |
| Published | Oct 01, 2021 |
| Venue | 2021 IEEE/CVF International Conference on Computer Vision (ICCV) |
| Authors | Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo |
| DOI | 10.1109/iccv48922.2021.00986 |
| Citations | 29,920 |
| Source | OpenAlex |