🤖 Artificial Intelligence OpenAlex

PaLM: Scaling Language Modeling with Pathways

📅 Published: April 5, 2022 👤 Aakanksha Chowdhery, Sharan Narang, Jacob Devlin et al. 📖 arXiv (Cornell University) 📊 2,131 citations
AI-Generated Summary

Large language models achieve remarkable performance across a variety of natural language tasks through few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt a model to a particular application. The paper also provides a comprehensive analysis of bias and toxicity, and studies how the extent of training data memorization varies with model scale.
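As a rough illustration of what few-shot learning means in practice, the sketch below assembles a prompt from a handful of solved examples followed by one unsolved query. The function name `build_few_shot_prompt`, the `Input:`/`Output:` template, and the translation examples are all hypothetical choices made for this illustration. They are not taken from the paper, and no model is called.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Format solved (input, output) pairs followed by one unsolved query.

    The template is an assumption for illustration; real evaluations use
    task-specific formats.
    """
    # One block per solved example.
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    # The final block leaves the output empty for the model to complete.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


# Two demonstrations stand in for the "few" task-specific examples.
examples = [("cheese", "fromage"), ("house", "maison")]
prompt = build_few_shot_prompt(examples, "book")
print(prompt)
```

The model's weights are not updated here. The only task-specific data is the short list of demonstrations placed in the prompt, which is why few-shot learning needs so few examples.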


Key Findings
  • To better understand the impact of scale on few-shot learning, the authors trained a 540-billion parameter, densely activated Transformer language model, which they call the Pathways Language Model (PaLM).
  • PaLM was trained on 6144 TPU v4 chips using Pathways, a new ML system that enables highly efficient training across multiple TPU Pods.
  • The authors demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks.
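To give a sense of the scale behind these figures, here is a back-of-the-envelope calculation using only the two numbers quoted above (540 billion parameters, 6144 chips). The 2-bytes-per-parameter figure is an assumption (16-bit weights), and the even split across chips is purely illustrative. It does not describe how PaLM was actually partitioned, and it ignores optimizer state, activations, and any replication.

```python
PARAMS = 540_000_000_000   # parameter count quoted in the summary
CHIPS = 6144               # TPU v4 chip count quoted in the summary
BYTES_PER_PARAM = 2        # assumption: 16-bit weights

# Memory needed to hold one copy of the weights alone.
weight_bytes = PARAMS * BYTES_PER_PARAM
weight_terabytes = weight_bytes / 1e12

# Hypothetical even split of one copy of the parameters over all chips.
params_per_chip = PARAMS // CHIPS

print(f"weights: {weight_terabytes:.2f} TB")
print(f"parameters per chip (even split): {params_per_chip:,}")
```

Under these assumptions a single copy of the weights occupies roughly 1.08 TB, far more than one accelerator can hold, which is why training at this scale depends on a system that spreads work across many chips.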
Why It Matters

The paper shows that scaling a dense Transformer to 540 billion parameters continues to improve few-shot performance, and it pairs those results with analyses of bias, toxicity, and training data memorization.

This summary is based on publicly available metadata and abstract. For the full research paper, visit the original source:

Read Full Paper at OpenAlex
Article Details
Source: OpenAlex
Category: 🤖 Artificial Intelligence
Published: Apr 5, 2022
Journal: arXiv (Cornell University)
DOI: 10.48550/arxiv.2204.02311
Citations: 2,131
Authors: Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, et al.