Making language models bigger does not inherently make them better at following a user's intent. InstructGPT models show improvements in truthfulness and reductions in toxic output generation, with minimal performance regressions on public NLP datasets.
This research advances how AI systems learn, reason, and solve problems, with direct implications for automation and scientific discovery.
The original paper, an arXiv preprint, is available via OpenAlex.
| Field | Value |
| --- | --- |
| Category | 🤖 Artificial Intelligence |
| Published | Mar 04, 2022 |
| Journal | arXiv (Cornell University) |
| Authors | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright |
| DOI | 10.48550/arxiv.2203.02155 |
| Citations | 4,287 |
| Source | OpenAlex |