
An Empirical Study of Training Self-Supervised Vision Transformers

📅 Published: October 1, 2021 👤 Xinlei Chen, Saining Xie, Kaiming He 📖 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 📊 1,446 citations
AI-Generated Summary

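This paper does not propose a novel method. It is an empirical study of self-supervised training for Vision Transformers (ViT), and the authors discuss the currently positive evidence along with the remaining challenges and open questions.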


Key Findings
  • Rather than introducing a new technique, the paper studies a straightforward, incremental, yet must-know baseline given recent progress in computer vision: self-supervised learning for Vision Transformers (ViT).
  • Training recipes for standard convolutional networks are mature and robust, but recipes for ViT have yet to be established, especially in self-supervised settings, where training is more challenging.
  • The authors go back to basics and investigate the effects of several fundamental components of self-supervised ViT training.
Why It Matters

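Training recipes for ViT are far less established than those for convolutional networks, particularly in the self-supervised setting. By examining the fundamental components one at a time, the study gives practitioners a baseline to build on and flags the questions that remain open.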

This summary is based on the paper's publicly available metadata and abstract. For the full paper, visit the original source:

Read Full Paper at OpenAlex
Article Details
Source: OpenAlex
Category: 🤖 Artificial Intelligence
Published: Oct 1, 2021
Journal: 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
DOI: 10.1109/iccv48922.2021.00950
Citations: 1,446
Authors: Xinlei Chen, Saining Xie, Kaiming He