Although convolutional neural networks (CNNs) have achieved great success in computer vision, this work investigates a simpler, convolution-free backbone network useful for many dense prediction tasks. For example, with a comparable number of parameters, PVT+RetinaNet achieves 40.4 AP on the COCO dataset, surpassing ResNet50+RetinaNet (36.3 AP) by 4.1 absolute AP (see Figure 2).
| Field | Value |
| --- | --- |
| Category | 🤖 Artificial Intelligence |
| Published | Oct 01, 2021 |
| Journal | 2021 IEEE/CVF International Conference on Computer Vision (ICCV) |
| Authors | Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song |
| DOI | 10.1109/iccv48922.2021.00061 |
| Citations | 4,679 |
| Source | OpenAlex |