Attention Is All You Need

📅 August 23, 2025 👤 Ashish Vaswani, Noam Shazeer, Niki Parmar et al. 📖 Research Journal 📊 6,562 citations

🤖 Plain-English Summary

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. On the WMT 2014 English-to-French translation task, our model, the Transformer, establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature.

🔑 Key Findings

  • The best performing models also connect the encoder and decoder through an attention mechanism.
  • We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely.
  • Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
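To make the "based solely on attention" finding concrete, here is a minimal sketch of a single attention step, assuming the scaled dot-product form softmax(QKᵀ / √d_k) V. The function names and the toy inputs are illustrative assumptions, not code from the paper, and the sketch omits multiple heads, masking, and learned projections.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Illustrative attention over plain Python lists of vectors.

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k (one per value)
    values:  list of vectors of length d_v
    Returns one output vector of length d_v per query.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        )
    return outputs


if __name__ == "__main__":
    # Toy example: a zero query scores every key equally,
    # so the output is the plain average of the values.
    result = scaled_dot_product_attention(
        queries=[[0.0, 0.0]],
        keys=[[1.0, 0.0], [0.0, 1.0]],
        values=[[1.0, 2.0], [3.0, 4.0]],
    )
    print(result)
```

Each query's output depends only on that query and the shared keys and values, not on the outputs for other positions. That independence is what lets all positions be computed at once, in contrast to a recurrent network that must process a sequence step by step.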

💡 Why This Matters

By replacing recurrence and convolutions with attention alone, the Transformer can be trained with far more parallelism than earlier encoder-decoder models, cutting training time while improving translation quality.

📋 Article Details

Category: 🤖 Artificial Intelligence
Published: Aug 23, 2025
Journal: Research Journal
Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones
DOI: 10.65215/2q58a426
Citations: 6,562
Source: OpenAlex
