
CMT: Convolutional Neural Networks Meet Vision Transformers

📅 Published: June 1, 2022 👤 Jianyuan Guo, Kai Han, Han Wu et al. 📖 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 📊 850 citations
AI-Generated Summary

Vision transformers have been successfully applied to image recognition because they can capture long-range dependencies within an image. The paper's CMT-S model achieves 83.5% top-1 accuracy on ImageNet while requiring 14x and 2x fewer FLOPs than the existing DeiT and EfficientNet, respectively.


Key Findings
  • Gaps in both performance and computational cost remain between transformers and existing convolutional neural networks (CNNs).
  • The paper aims to close these gaps with a network that outperforms both canonical transformers and high-performance convolutional models.
  • The authors propose a new transformer-based hybrid network that uses transformers to capture long-range dependencies and CNNs to extract local information.
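
The hybrid idea in the last finding can be illustrated with a small NumPy sketch. It is not the paper's CMT block: the function names, tensor sizes, random weights, and the choice of a depthwise 3x3 convolution as the local branch are all assumptions made for illustration. It only shows one plausible way to pair a convolution (local information) with self-attention (long-range dependencies) inside a single residual block.

```python
import numpy as np

def depthwise_conv3x3(x, kernels):
    """Local mixing: each channel is filtered by its own 3x3 kernel (zero padding).

    x has shape (H, W, C); kernels has shape (3, 3, C). Hypothetical helper.
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            # Shifted view of the input, weighted per channel.
            out += padded[dy:dy + h, dx:dx + w, :] * kernels[dy, dx, :]
    return out

def self_attention(tokens, wq, wk, wv):
    """Global mixing: every token attends to every other token (single head)."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def hybrid_block(x, kernels, wq, wk, wv):
    """Assumed layout: residual local branch followed by residual global branch."""
    h, w, c = x.shape
    x = x + depthwise_conv3x3(x, kernels)                  # local information
    tokens = x.reshape(h * w, c)                           # flatten pixels to tokens
    tokens = tokens + self_attention(tokens, wq, wk, wv)   # long-range dependencies
    return tokens.reshape(h, w, c)

# Illustrative sizes and random weights (assumptions, not values from the paper).
rng = np.random.default_rng(0)
h, w, c = 8, 8, 16
x = rng.standard_normal((h, w, c))
kernels = rng.standard_normal((3, 3, c)) * 0.1
wq, wk, wv = (rng.standard_normal((c, c)) * 0.1 for _ in range(3))

y = hybrid_block(x, kernels, wq, wk, wv)
print(y.shape)  # (8, 8, 16)
```

The convolution touches only a 3x3 neighbourhood per output position, while the attention step lets any position influence any other, which is the division of labour the finding describes.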
Why It Matters

By combining convolutional and transformer components in one network, this work targets image recognition models that are both more accurate and cheaper to compute than either family alone.

This summary is based on publicly available metadata and abstract. For the full research paper, visit the original source:

Read Full Paper at OpenAlex
Article Details
Source: OpenAlex
Category: 🤖 Artificial Intelligence
Published: Jun 1, 2022
Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.01186
Citations: 850
Authors: Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Xinghao Chen