🤖 Artificial Intelligence OpenAlex

Scaling Instruction-Finetuned Language Models

📅 Published: October 20, 2022 👤 Hyung Won Chung, Le Hou, Shayne Longpre et al. 📖 arXiv (Cornell University) 📊 1,192 citations
AI-Generated Summary

Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. The authors also publicly release Flan-T5 checkpoints, which achieve strong few-shot performance even compared to much larger models such as PaLM 62B.
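As a rough illustration of how a released checkpoint might be tried out, the sketch below phrases a task as an instruction and passes it to a Flan-T5 model through the Hugging Face `transformers` library. The checkpoint identifier `google/flan-t5-small`, the helper names `build_instruction` and `run_flan_t5`, and the instruction template are assumptions made for this example; none of them are named in the summary above.

```python
# Assumed Hub identifier for a small Flan-T5 checkpoint (not named in this summary).
DEFAULT_CHECKPOINT = "google/flan-t5-small"


def build_instruction(task, text):
    """Phrase a task as a natural-language instruction (hypothetical template)."""
    return f"{task.strip()}\n\n{text.strip()}"


def run_flan_t5(prompt, checkpoint=DEFAULT_CHECKPOINT, max_new_tokens=32):
    """Generate a completion with a Flan-T5 checkpoint.

    The import is deferred so build_instruction can be used without
    `transformers` installed; calling this function downloads the weights.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


prompt = build_instruction(
    "Translate the following sentence to German.",
    "The weather is nice today.",
)
# Example call (needs network access plus the `transformers` and `torch` packages):
# print(run_flan_t5(prompt))
```

Because the model is instruction-finetuned, the task description is written directly into the input text rather than being encoded in a task-specific head or prefix.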

⚡ This is an original paraphrased summary — not copied from the abstract. Full paper available at the source link below.

Key Findings
  • 1 The paper explores instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought (CoT) data.
  • 2 Instruction finetuning along these three axes dramatically improves performance across a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation).
  • 3 For instance, Flan-PaLM 540B, instruction-finetuned on 1.8K tasks, outperforms PaLM 540B by a large margin (+9.4% on average).
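The three prompting setups named in the findings (zero-shot, few-shot, and CoT) differ only in what is placed in front of the new question. The sketch below makes that difference concrete. The `Q:`/`A:` layout, the `Example` class, the `format_prompt` function, and the sample questions are all hypothetical and chosen for illustration; they are not the templates or data used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Example:
    question: str
    answer: str
    rationale: str = ""  # optional step-by-step reasoning, used for CoT exemplars


def format_prompt(question, exemplars=(), chain_of_thought=False):
    """Build a prompt string for one of the prompting setups.

    - no exemplars          -> zero-shot
    - one or more exemplars -> few-shot
    - chain_of_thought=True -> exemplars show their rationale before the answer
    """
    blocks = []
    for ex in exemplars:
        if chain_of_thought and ex.rationale:
            target = f"{ex.rationale} So the answer is {ex.answer}."
        else:
            target = ex.answer
        blocks.append(f"Q: {ex.question}\nA: {target}")
    # The new question always comes last, with the answer left open.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Hypothetical exemplar and query, invented for this sketch.
demo = Example(
    question="Tom has 3 apples and buys 2 more. How many apples does he have?",
    answer="5",
    rationale="Tom starts with 3 apples. 3 + 2 = 5.",
)
new_question = "A shelf holds 4 books and 6 more are added. How many books are on it?"

zero_shot = format_prompt(new_question)
few_shot = format_prompt(new_question, exemplars=[demo])
few_shot_cot = format_prompt(new_question, exemplars=[demo], chain_of_thought=True)
```

Printing `few_shot` and `few_shot_cot` side by side shows that the only change is whether the exemplar's answer is preceded by its reasoning steps.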
Why It Matters

The results suggest that instruction finetuning is a general method for improving the performance and usability of pretrained language models, with gains that hold across model families, prompting setups, and benchmarks.

This summary is based on publicly available metadata and abstract. For the full research paper, visit the original source:

Read Full Paper at OpenAlex
Article Details
Source OpenAlex
Category 🤖 Artificial Intelligence
Published Oct 20, 2022
Journal arXiv (Cornell University)
DOI 10.48550/arxiv.2210.11416
Citations 1,192
Authors Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay