
Lost in the Middle: How Language Models Use Long Contexts

Published: January 1, 2024
Authors: Nelson F. Liu, Kevin Lin, John Hewitt et al.
Journal: Transactions of the Association for Computational Linguistics
Citations: 887
AI-Generated Summary

Abstract: While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context. In particular, we observe that performance is often highest when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts, even for explicitly long-context models.


Key Findings
  • 1. The authors analyze language model performance on two tasks that require identifying relevant information in the input context: multi-document question answering and key-value retrieval.
  • 2. Performance can degrade significantly when the position of the relevant information changes, indicating that current language models do not robustly make use of information in long input contexts.
  • 3. Performance is often highest when the relevant information sits at the beginning or end of the input context, and it drops significantly when the model must access information in the middle, even for explicitly long-context models.
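
To make the key-value retrieval setup in the findings above more concrete, here is a minimal Python sketch of a position sweep. It is an illustration under stated assumptions, not the authors' code: the use of random UUID strings as keys and values, the prompt wording, and the names `make_pairs`, `build_kv_prompt`, `sweep_positions`, `accuracy_by_position`, and `answer_fn` are all hypothetical choices made for this example.

```python
import json
import random
import uuid


def make_pairs(num_pairs, seed=0):
    """Return a list of (key, value) pairs of random UUID strings."""
    rng = random.Random(seed)  # seeded so each trial is reproducible
    return [
        (str(uuid.UUID(int=rng.getrandbits(128))),
         str(uuid.UUID(int=rng.getrandbits(128))))
        for _ in range(num_pairs)
    ]


def build_kv_prompt(num_pairs, target_index, seed=0):
    """Build a prompt whose queried key sits at target_index in the context."""
    if not 0 <= target_index < num_pairs:
        raise ValueError("target_index must be in [0, num_pairs)")
    pairs = make_pairs(num_pairs, seed)
    target_key, target_value = pairs[target_index]
    # dict() and json.dumps both preserve insertion order, so the queried
    # pair stays at the requested position in the serialized context.
    context = json.dumps(dict(pairs), indent=1)
    # Assumed prompt wording, chosen for this sketch only.
    prompt = (
        "Extract the value corresponding to the specified key "
        "in the JSON object below.\n\n"
        f"{context}\n\n"
        f'Key: "{target_key}"\nCorresponding value:'
    )
    return prompt, target_key, target_value


def sweep_positions(num_pairs, num_positions):
    """Evenly spaced target indices from the first slot to the last."""
    if num_positions < 2:
        return [0]
    step = num_pairs - 1
    return [i * step // (num_positions - 1) for i in range(num_positions)]


def accuracy_by_position(answer_fn, num_pairs, num_positions, trials=5):
    """Map each target index to the fraction of trials answered correctly.

    answer_fn is a hypothetical callable (prompt -> model output string).
    """
    results = {}
    for index in sweep_positions(num_pairs, num_positions):
        correct = 0
        for seed in range(trials):
            prompt, _, value = build_kv_prompt(num_pairs, index, seed)
            # Count a trial as correct if the expected value appears in the output.
            correct += value in answer_fn(prompt)
        results[index] = correct / trials
    return results
```

Passing a real model wrapper as `answer_fn` and plotting the returned accuracies against the target index would show whether accuracy dips for middle positions, which is the pattern the findings describe. Substring matching is a deliberately simple scoring rule here; a stricter comparison may be preferable in practice.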
Why It Matters

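The results indicate that where relevant information sits in a long input affects whether a model actually uses it. A long context window alone therefore does not guarantee that a model makes robust use of everything placed in it, and this holds even for models built explicitly for long contexts.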

Article Details
Source: OpenAlex
Category: Artificial Intelligence
DOI: 10.1162/tacl_a_00638
Authors: Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua