🤖 Artificial Intelligence OpenAlex

InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

📅 Published: June 1, 2023 👤 Wenhai Wang, Jifeng Dai, Zhe Chen et al. 📖 Research Journal 📊 894 citations
AI-Generated Summary

Compared to the great progress of large-scale vision transformers (ViTs) in recent years, large-scale models based on convolutional neural networks (CNNs) are still at an early stage. The effectiveness of InternImage is demonstrated on challenging benchmarks including ImageNet, COCO, and ADE20K.

⚡ This is an original paraphrased summary — not copied from the abstract. Full paper available at the source link below.

Key Findings
  1. This work presents a new large-scale CNN-based foundation model, termed InternImage, which, like ViTs, benefits from increased parameters and training data.
  2. Unlike recent CNNs that focus on large dense kernels, InternImage takes deformable convolution as its core operator. The model therefore has the large effective receptive field required for downstream tasks such as detection and segmentation, and also has adaptive spatial aggregation conditioned on input and task information.
  3. As a result, InternImage reduces the strict inductive bias of traditional CNNs and, like ViTs, can learn stronger and more robust patterns from massive data with large-scale parameters.
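
The idea behind the deformable convolution named in the second finding can be sketched in a few lines. This is a minimal, pure-Python illustration of generic deformable sampling, not InternImage's actual operator: a 3x3 kernel reads the input at its regular grid positions plus a per-tap offset, using bilinear interpolation for fractional positions. The function names `bilinear_sample` and `deformable_conv_at` are hypothetical, and the offsets are passed in by hand here, whereas a real layer would predict them from the input features.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2D list-of-lists at fractional (y, x); out-of-bounds pixels read as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    # Blend the four integer neighbours, weighted by their distance to (y, x).
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * image[yy][xx]
    return total


def deformable_conv_at(image, weights, offsets, cy, cx):
    """Output of a 3x3 deformable convolution centred at (cy, cx).

    weights: 9 kernel weights in row-major order.
    offsets: 9 (dy, dx) pairs, one per kernel tap (assumption: supplied by
             the caller; a real layer learns to predict them from the input).
    """
    out = 0.0
    k = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            dy, dx = offsets[k]
            # Regular grid position plus the offset for this tap.
            out += weights[k] * bilinear_sample(image, cy + ky + dy, cx + kx + dx)
            k += 1
    return out


if __name__ == "__main__":
    image = [[4 * r + c for c in range(4)] for r in range(4)]
    mean_kernel = [1 / 9] * 9
    # Zero offsets reduce to an ordinary 3x3 convolution.
    print(deformable_conv_at(image, mean_kernel, [(0.0, 0.0)] * 9, 1, 1))
    # Non-zero offsets move every sampling point off the regular grid.
    print(deformable_conv_at(image, mean_kernel, [(0.5, 0.5)] * 9, 1, 1))
```

With all offsets at zero the function behaves like an ordinary convolution. Because the offsets can differ per tap and per location, the set of pixels being aggregated adapts to the input, which is the sense in which the receptive field is not fixed by the kernel shape.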
Why It Matters

This research shows that CNNs built on deformable convolution can, like ViTs, scale with parameters and training data, which matters for downstream vision tasks such as detection and segmentation.

This summary is based on publicly available metadata and abstract. For the full research paper, visit the original source:

Read Full Paper at OpenAlex
Article Details
Source OpenAlex
Category 🤖 Artificial Intelligence
Published Jun 1, 2023
Journal Research Journal
DOI 10.1109/cvpr52729.2023.01385
Citations 894
Authors Wenhai Wang, Jifeng Dai, Zhe Chen, Zhenhang Huang, Zhiqi Li