🤖 Artificial Intelligence OpenAlex

How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment

📅 February 8, 2023 👤 Aidan Gilson, Conrad Safranek, Thomas Huang et al. 📖 JMIR Medical Education 📊 2,014 citations

🤖 Plain-English Summary

BACKGROUND: Chat Generative Pre-trained Transformer (ChatGPT) is a 175-billion-parameter natural language processing model that can generate conversation-style responses to user input. The authors also highlight ChatGPT's capacity to provide logic and informational context across the majority of its answers.

🔑 Key Findings

  • OBJECTIVE: This study aimed to evaluate the performance of ChatGPT on questions within the scope of the United States Medical Licensing Examination (USMLE) Step 1 and Step 2 exams, as well as to analyze responses for user interpretability.
  • METHODS: We used 2 sets of multiple-choice questions to evaluate ChatGPT's performance, each with questions pertaining to Step 1 and Step 2.
  • The first set was derived from AMBOSS, a commonly used question bank for medical students, which also provides statistics on question difficulty and the performance on an exam relative to the user base.
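As a rough illustration of the evaluation described above, the sketch below tallies multiple-choice accuracy separately for each question set and exam step. It is not the authors' code: the function name `accuracy_by_group`, the record layout, and the sample data are all hypothetical and invented only to show the bookkeeping.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per (question_set, step) group.

    `records` is an iterable of (question_set, step, is_correct) tuples.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [correct, answered]
    for question_set, step, is_correct in records:
        key = (question_set, step)
        totals[key][0] += int(is_correct)
        totals[key][1] += 1
    return {key: correct / answered for key, (correct, answered) in totals.items()}


# Hypothetical graded responses; these are NOT results from the study.
sample = [
    ("set_A", "Step 1", True),
    ("set_A", "Step 1", False),
    ("set_A", "Step 2", True),
    ("set_B", "Step 1", True),
]
scores = accuracy_by_group(sample)
for group, accuracy in sorted(scores.items()):
    print(group, f"{accuracy:.0%}")
```

Grouping by both set and step keeps the two question sources separate, so a difference in difficulty between them is not hidden inside a single pooled score.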

💡 Why This Matters

This research evaluates how a large language model performs on medical licensing exam questions, with direct implications for medical education and knowledge assessment.

📋 Article Details

Category 🤖 Artificial Intelligence
Published Feb 08, 2023
Journal JMIR Medical Education
Authors Aidan Gilson, Conrad Safranek, Thomas Huang, Vimig Socrates, Ling Chi
DOI 10.2196/45312
Citations 2,014
Source OpenAlex
