🤖 Artificial Intelligence OpenAlex

Multimodal Learning With Transformers: A Survey

📅 Published: May 11, 2023
👤 Authors: Peng Xu, Xiatian Zhu, David A. Clifton
📖 Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
📊 Citations: 848
AI-Generated Summary

The Transformer is a promising neural network learner that has achieved great success across a range of machine learning tasks. This paper presents a comprehensive survey of Transformer techniques for multimodal data.

⚡ This is an original paraphrased summary — not copied from the abstract. Full paper available at the source link below.

Key Findings
  • 1 Driven by the recent prevalence of multimodal applications and Big Data, Transformer-based multimodal learning has become an active topic in AI research.
  • 2 The paper surveys Transformer techniques designed for multimodal data.
  • 3 The survey covers five areas:
      (1) background on multimodal learning, the Transformer ecosystem, and the multimodal Big Data era;
      (2) a systematic review of the Vanilla Transformer, the Vision Transformer, and multimodal Transformers from a geometrically topological perspective;
      (3) a review of multimodal Transformer applications under two paradigms: multimodal pretraining and specific multimodal tasks;
      (4) a summary of the challenges and designs common to multimodal Transformer models and applications;
      (5) a discussion of open problems and potential research directions for the community.
Why It Matters

The survey consolidates a fast-growing body of work on multimodal Transformers. It summarizes the challenges and designs these models share and points to open problems and research directions for the community.

This summary is based on the paper's publicly available metadata and abstract. For the full research paper, visit the original source:

Read Full Paper at OpenAlex
Article Details
Source: OpenAlex
Category: 🤖 Artificial Intelligence
Published: May 11, 2023
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
DOI: 10.1109/tpami.2023.3275156
Citations: 848
Authors: Peng Xu, Xiatian Zhu, David A. Clifton